diff --git a/MANIFEST.in b/MANIFEST.in new file mode 100644 index 0000000..b2ed5e5 --- /dev/null +++ b/MANIFEST.in @@ -0,0 +1,3 @@ +include *.txt +include rl_coach/environments/CarlaSettings.ini +include rl_coach/dashboard_components/spinner.css diff --git a/README.md b/README.md index 84291a8..a93f7a9 100644 --- a/README.md +++ b/README.md @@ -1,10 +1,10 @@ # Coach [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://github.com/NervanaSystems/coach/blob/master/LICENSE) -[![Docs](https://readthedocs.org/projects/pip/badge/?version=latest&style=flat)](http://NervanaSystems.github.io/coach/) +[![Docs](https://media.readthedocs.org/static/projects/badges/passing-flat.svg)](https://nervanasystems.github.io/coach/) [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.1134898.svg)](https://doi.org/10.5281/zenodo.1134898) -## Overview +

Coach Logo

Coach is a Python reinforcement learning research framework containing implementations of many state-of-the-art algorithms. @@ -36,7 +36,6 @@ Contacting the Coach development team is also possible through the email [coach@ * [Usage](#usage) + [Running Coach](#running-coach) + [Running Coach Dashboard (Visualization)](#running-coach-dashboard-visualization) - + [Parallelizing an Algorithm](#parallelizing-an-algorithm) * [Supported Environments](#supported-environments) * [Supported Algorithms](#supported-algorithms) * [Citation](#citation) @@ -44,56 +43,69 @@ Contacting the Coach development team is also possible through the email [coach@ ## Documentation -Framework documentation, algorithm description and instructions on how to contribute a new agent/environment can be found [here](http://NervanaSystems.github.io/coach/). +Framework documentation, algorithm description and instructions on how to contribute a new agent/environment can be found [here](https://nervanasystems.github.io/coach/). ## Installation Note: Coach has only been tested on Ubuntu 16.04 LTS, and with Python 3.5. -### Coach Installer +For some information on installing on Ubuntu 17.10 with Python 3.6.3, please refer to the following issue: https://github.com/NervanaSystems/coach/issues/54 -Coach's installer will setup all the basics needed to get the user going with running Coach on top of [OpenAI Gym](https://github.com/openai/gym) environments. This can be done by running the following command and then following the on-screen printed instructions: +In order to install Coach, a few prerequisites are required. Installing them will set up all the basics needed to get the user going with running Coach on top of [OpenAI Gym](https://github.com/openai/gym) environments: -```bash -./install.sh +``` +# General +sudo -E apt-get install python3-pip cmake zlib1g-dev python3-tk python-opencv -y + +# Boost libraries +sudo -E apt-get install libboost-all-dev -y + +# Scipy requirements +sudo -E apt-get install libblas-dev liblapack-dev libatlas-base-dev gfortran -y + +# PyGame +sudo -E apt-get install libsdl-dev libsdl-image1.2-dev libsdl-mixer1.2-dev libsdl-ttf2.0-dev +libsmpeg-dev libportmidi-dev libavformat-dev libswscale-dev -y + +# Dashboard +sudo -E apt-get install dpkg-dev build-essential python3.5-dev libjpeg-dev libtiff-dev libsdl1.2-dev libnotify-dev +freeglut3 freeglut3-dev libsm-dev libgtk2.0-dev libgtk-3-dev libwebkitgtk-dev libwebkitgtk-3.0-dev +libgstreamer-plugins-base1.0-dev -y + +# Gym +sudo -E apt-get install libav-tools libsdl2-dev swig cmake -y ``` -Coach creates a virtual environment and installs in it to avoid changes to the user's system. +We recommend installing Coach in a virtualenv: -In order to activate and deactivate Coach's virtual environment: - -```bash -source coach_env/bin/activate +``` +sudo -E pip3 install virtualenv +virtualenv -p python3 coach_env +. coach_env/bin/activate ``` -```bash -deactivate +Finally, install Coach using pip: ``` +pip3 install rl_coach +``` + +Alternatively, for a development environment, install Coach from the cloned repository: ``` +cd coach +pip3 install -e . +``` + +If a GPU is present, Coach's pip package will install tensorflow-gpu by default. If a GPU is not present, an [Intel-Optimized TensorFlow](https://software.intel.com/en-us/articles/intel-optimized-tensorflow-wheel-now-available) will be installed.
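As a quick sanity check after the pip installation (a minimal sketch, assuming the virtualenv created above is still active), the following commands simply import the installed package and report which TensorFlow build was picked up:

```
# Run inside the activated virtualenv
python3 -c "import rl_coach; print('rl_coach imported successfully')"
python3 -c "import tensorflow as tf; print(tf.__version__, tf.test.is_built_with_cuda())"
```

If either import fails, revisiting the prerequisites listed above is usually the quickest fix.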
In addition to OpenAI Gym, several other environments were tested and are supported. Please follow the instructions in the Supported Environments section below in order to install more environments. -### TensorFlow GPU Support - -Coach's installer installs [Intel-Optimized TensorFlow](https://software.intel.com/en-us/articles/intel-optimized-tensorflow-wheel-now-available), which does not support GPU, by default. In order to have Coach running with GPU, a GPU supported TensorFlow version must be installed. This can be done by overriding the TensorFlow version: - -```bash -pip3 install tensorflow-gpu -``` - ## Usage ### Running Coach -Coach supports both TensorFlow and neon deep learning frameworks. - -Switching between TensorFlow and neon backends is possible by using the `-f` flag. - -Using TensorFlow (default): `-f tensorflow` - -Using neon: `-f neon` - -There are several available presets in presets.py. +To allow reproducing results in Coach, we defined a mechanism called _preset_. +There are several available presets under the `presets` directory. To list all the available presets, use the `-l` flag. To run a preset, use: @@ -103,39 +115,44 @@ python3 coach.py -r -p ``` For example: -1. CartPole environment using Policy Gradients: +* CartPole environment using Policy Gradients (PG): ```bash python3 coach.py -r -p CartPole_PG ``` - -2. Pendulum using Clipped PPO: + +* Basic level of Doom using Dueling network and Double DQN (DDQN) algorithm: ```bash - python3 coach.py -r -p Pendulum_ClippedPPO -n 8 + python3 coach.py -r -p Doom_Basic_Dueling_DDQN ``` -3. MountainCar using A3C: +Some presets apply to a group of environment levels, such as the entire Atari or MuJoCo suites. +To use these presets, the requested level should be defined using the `-lvl` flag. + +For example: + + +* Pong using the Neural Episodic Control (NEC) algorithm: ```bash - python3 coach.py -r -p MountainCar_A3C -n 8 + python3 coach.py -r -p Atari_NEC -lvl pong ``` -4. Doom basic level using Dueling network and Double DQN algorithm: +Several types of agents can benefit from running in a distributed fashion with multiple workers in parallel. Each worker interacts with its own copy of the environment but updates a shared network, which improves the data collection speed and the stability of the learning process. +To specify the number of workers to run, use the `-n` flag. + +For example: +* Breakout using Asynchronous Advantage Actor-Critic (A3C) with 8 workers: ```bash - python3 coach.py -r -p Doom_Basic_Dueling_DDQN + python3 coach.py -r -p Atari_A3C -lvl breakout -n 8 ``` -5. Doom health gathering level using Mixed Monte Carlo: - - ```bash - python3 coach.py -r -p Doom_Health_MMC - ``` It is easy to create new presets for different levels or environments by following the same pattern as in presets.py -More usage examples can be found [here](http://NervanaSystems.github.io/coach/usage/index.html). +More usage examples can be found [here](https://nervanasystems.github.io/coach/usage/index.html). ### Running Coach Dashboard (Visualization) Training an agent to solve an environment can be tricky, at times. @@ -152,36 +169,14 @@ python3 dashboard.py -Coach Design - - -### Parallelizing an Algorithm - -Since the introduction of [A3C](https://arxiv.org/abs/1602.01783) in 2016, many algorithms were shown to benefit from running multiple instances in parallel, on many CPU cores.
So far, these algorithms include [A3C](https://arxiv.org/abs/1602.01783), [DDPG](https://arxiv.org/pdf/1704.03073.pdf), [PPO](https://arxiv.org/pdf/1707.06347.pdf), and [NAF](https://arxiv.org/pdf/1610.00633.pdf), and this is most probably only the begining. - -Parallelizing an algorithm using Coach is straight-forward. - -The following method of NetworkWrapper parallelizes an algorithm seamlessly: - -```python -network.train_and_sync_networks(current_states, targets) -``` - -Once a parallelized run is started, the ```train_and_sync_networks``` API will apply gradients from each local worker's network to the main global network, allowing for parallel training to take place. - -Then, it merely requires running Coach with the ``` -n``` flag and with the number of workers to run with. For instance, the following command will set 16 workers to work together to train a MuJoCo Hopper: - -```bash -python3 coach.py -p Hopper_A3C -n 16 -``` - +Coach Design ## Supported Environments * *OpenAI Gym:* - Installed by default by Coach's installer. + Installed by default by Coach's installer. The version used by Coach is 0.10.5. * *ViZDoom:* @@ -189,6 +184,7 @@ python3 coach.py -p Hopper_A3C -n 16 https://github.com/mwydmuch/ViZDoom + The version currently used by Coach is 1.1.4. Additionally, Coach assumes that the environment variable VIZDOOM_ROOT points to the ViZDoom installation directory. * *Roboschool:* @@ -211,7 +207,7 @@ python3 coach.py -p Hopper_A3C -n 16 * *CARLA:* - Download release 0.7 from the CARLA repository - + Download release 0.8.4 from the CARLA repository - https://github.com/carla-simulator/carla/releases @@ -219,6 +215,22 @@ python3 coach.py -p Hopper_A3C -n 16 A simple CARLA settings file (```CarlaSettings.ini```) is supplied with Coach, and is located in the ```environments``` directory. 
+* *Starcraft:* + + Follow the instructions described in the PySC2 repository - + + https://github.com/deepmind/pysc2 + + The version used by Coach is 2.0.1 + +* *DeepMind Control Suite:* + + Follow the instructions described in the DeepMind Control Suite repository - + + https://github.com/deepmind/dm_control + + The version used by Coach is 0.0.0 + ## Supported Algorithms @@ -227,25 +239,47 @@ python3 coach.py -p Hopper_A3C -n 16 - -* [Deep Q Network (DQN)](https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf) ([code](agents/dqn_agent.py)) -* [Double Deep Q Network (DDQN)](https://arxiv.org/pdf/1509.06461.pdf) ([code](agents/ddqn_agent.py)) +### Value Optimization Agents +* [Deep Q Network (DQN)](https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf) ([code](rl_coach/agents/dqn_agent.py)) +* [Double Deep Q Network (DDQN)](https://arxiv.org/pdf/1509.06461.pdf) ([code](rl_coach/agents/ddqn_agent.py)) * [Dueling Q Network](https://arxiv.org/abs/1511.06581) -* [Mixed Monte Carlo (MMC)](https://arxiv.org/abs/1703.01310) ([code](agents/mmc_agent.py)) -* [Persistent Advantage Learning (PAL)](https://arxiv.org/abs/1512.04860) ([code](agents/pal_agent.py)) -* [Categorical Deep Q Network (C51)](https://arxiv.org/abs/1707.06887) ([code](agents/categorical_dqn_agent.py)) -* [Quantile Regression Deep Q Network (QR-DQN)](https://arxiv.org/pdf/1710.10044v1.pdf) ([code](agents/qr_dqn_agent.py)) -* [Bootstrapped Deep Q Network](https://arxiv.org/abs/1602.04621) ([code](agents/bootstrapped_dqn_agent.py)) -* [N-Step Q Learning](https://arxiv.org/abs/1602.01783) | **Distributed** ([code](agents/n_step_q_agent.py)) -* [Neural Episodic Control (NEC)](https://arxiv.org/abs/1703.01988) ([code](agents/nec_agent.py)) -* [Normalized Advantage Functions (NAF)](https://arxiv.org/abs/1603.00748.pdf) | **Distributed** ([code](agents/naf_agent.py)) -* [Policy Gradients (PG)](http://www-anw.cs.umass.edu/~barto/courses/cs687/williams92simple.pdf) | **Distributed** ([code](agents/policy_gradients_agent.py)) -* [Asynchronous Advantage Actor-Critic (A3C)](https://arxiv.org/abs/1602.01783) | **Distributed** ([code](agents/actor_critic_agent.py)) -* [Deep Deterministic Policy Gradients (DDPG)](https://arxiv.org/abs/1509.02971) | **Distributed** ([code](agents/ddpg_agent.py)) -* [Proximal Policy Optimization (PPO)](https://arxiv.org/pdf/1707.06347.pdf) ([code](agents/ppo_agent.py)) -* [Clipped Proximal Policy Optimization](https://arxiv.org/pdf/1707.06347.pdf) | **Distributed** ([code](agents/clipped_ppo_agent.py)) -* [Direct Future Prediction (DFP)](https://arxiv.org/abs/1611.01779) | **Distributed** ([code](agents/dfp_agent.py)) -* Behavioral Cloning (BC) ([code](agents/bc_agent.py)) +* [Mixed Monte Carlo (MMC)](https://arxiv.org/abs/1703.01310) ([code](rl_coach/agents/mmc_agent.py)) +* [Persistent Advantage Learning (PAL)](https://arxiv.org/abs/1512.04860) ([code](rl_coach/agents/pal_agent.py)) +* [Categorical Deep Q Network (C51)](https://arxiv.org/abs/1707.06887) ([code](rl_coach/agents/categorical_dqn_agent.py)) +* [Quantile Regression Deep Q Network (QR-DQN)](https://arxiv.org/pdf/1710.10044v1.pdf) ([code](rl_coach/agents/qr_dqn_agent.py)) +* [N-Step Q Learning](https://arxiv.org/abs/1602.01783) | **Distributed** ([code](rl_coach/agents/n_step_q_agent.py)) +* [Neural Episodic Control (NEC)](https://arxiv.org/abs/1703.01988) ([code](rl_coach/agents/nec_agent.py)) +* [Normalized Advantage Functions (NAF)](https://arxiv.org/abs/1603.00748.pdf) | **Distributed** ([code](rl_coach/agents/naf_agent.py)) + +### Policy Optimization Agents +* 
[Policy Gradients (PG)](http://www-anw.cs.umass.edu/~barto/courses/cs687/williams92simple.pdf) | **Distributed** ([code](rl_coach/agents/policy_gradients_agent.py)) +* [Asynchronous Advantage Actor-Critic (A3C)](https://arxiv.org/abs/1602.01783) | **Distributed** ([code](rl_coach/agents/actor_critic_agent.py)) +* [Deep Deterministic Policy Gradients (DDPG)](https://arxiv.org/abs/1509.02971) | **Distributed** ([code](rl_coach/agents/ddpg_agent.py)) +* [Proximal Policy Optimization (PPO)](https://arxiv.org/pdf/1707.06347.pdf) ([code](rl_coach/agents/ppo_agent.py)) +* [Clipped Proximal Policy Optimization (CPPO)](https://arxiv.org/pdf/1707.06347.pdf) | **Distributed** ([code](rl_coach/agents/clipped_ppo_agent.py)) +* [Generalized Advantage Estimation (GAE)](https://arxiv.org/abs/1506.02438) ([code](rl_coach/agents/actor_critic_agent.py#L86)) + +### General Agents +* [Direct Future Prediction (DFP)](https://arxiv.org/abs/1611.01779) | **Distributed** ([code](rl_coach/agents/dfp_agent.py)) + +### Imitation Learning Agents +* Behavioral Cloning (BC) ([code](rl_coach/agents/bc_agent.py)) + +### Hierarchical Reinforcement Learning Agents +* [Hierarchical Actor Critic (HAC)](https://arxiv.org/abs/1712.00948.pdf) ([code](rl_coach/agents/ddpg_hac_agent.py)) + +### Memory Types +* [Hindsight Experience Replay (HER)](https://arxiv.org/abs/1707.01495.pdf) ([code](rl_coach/memories/episodic/episodic_hindsight_experience_replay.py)) +* [Prioritized Experience Replay (PER)](https://arxiv.org/abs/1511.05952) ([code](rl_coach/memories/non_episodic/prioritized_experience_replay.py)) + +### Exploration Techniques +* E-Greedy ([code](rl_coach/exploration_policies/e_greedy.py)) +* Boltzmann ([code](rl_coach/exploration_policies/boltzmann.py)) +* Ornstein–Uhlenbeck process ([code](rl_coach/exploration_policies/ou_process.py)) +* Normal Noise ([code](rl_coach/exploration_policies/additive_noise.py)) +* Truncated Normal Noise ([code](rl_coach/exploration_policies/truncated_normal.py)) +* [Bootstrapped Deep Q Network](https://arxiv.org/abs/1602.04621) ([code](rl_coach/agents/bootstrapped_dqn_agent.py)) +* [UCB Exploration via Q-Ensembles (UCB)](https://arxiv.org/abs/1706.01502) ([code](rl_coach/exploration_policies/ucb.py)) ## Citation diff --git a/agents/__init__.py b/agents/__init__.py deleted file mode 100644 index fdbd13e..0000000 --- a/agents/__init__.py +++ /dev/null @@ -1,38 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# - -from agents.actor_critic_agent import * -from agents.agent import * -from agents.bc_agent import * -from agents.bootstrapped_dqn_agent import * -from agents.clipped_ppo_agent import * -from agents.ddpg_agent import * -from agents.ddqn_agent import * -from agents.dfp_agent import * -from agents.dqn_agent import * -from agents.categorical_dqn_agent import * -from agents.human_agent import * -from agents.imitation_agent import * -from agents.mmc_agent import * -from agents.n_step_q_agent import * -from agents.naf_agent import * -from agents.nec_agent import * -from agents.pal_agent import * -from agents.policy_gradients_agent import * -from agents.policy_optimization_agent import * -from agents.ppo_agent import * -from agents.value_optimization_agent import * -from agents.qr_dqn_agent import * diff --git a/agents/actor_critic_agent.py b/agents/actor_critic_agent.py deleted file mode 100644 index 729e67f..0000000 --- a/agents/actor_critic_agent.py +++ /dev/null @@ -1,146 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -from agents.policy_optimization_agent import * -from logger import * -from utils import * -import scipy.signal - - -# Actor Critic - https://arxiv.org/abs/1602.01783 -class ActorCriticAgent(PolicyOptimizationAgent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0, create_target_network = False): - PolicyOptimizationAgent.__init__(self, env, tuning_parameters, replicated_device, thread_id, create_target_network) - self.last_gradient_update_step_idx = 0 - self.action_advantages = Signal('Advantages') - self.state_values = Signal('Values') - self.unclipped_grads = Signal('Grads (unclipped)') - self.value_loss = Signal('Value Loss') - self.policy_loss = Signal('Policy Loss') - self.signals.append(self.action_advantages) - self.signals.append(self.state_values) - self.signals.append(self.unclipped_grads) - self.signals.append(self.value_loss) - self.signals.append(self.policy_loss) - - # Discounting function used to calculate discounted returns. - def discount(self, x, gamma): - return scipy.signal.lfilter([1], [1, -gamma], x[::-1], axis=0)[::-1] - - def get_general_advantage_estimation_values(self, rewards, values): - # values contain n+1 elements (t ... t+n+1), rewards contain n elements (t ... t + n) - bootstrap_extended_rewards = np.array(rewards.tolist() + [values[-1]]) - - # Approximation based calculation of GAE (mathematically correct only when Tmax = inf, - # although in practice works even in much smaller Tmax values, e.g. 
20) - deltas = rewards + self.tp.agent.discount * values[1:] - values[:-1] - gae = self.discount(deltas, self.tp.agent.discount * self.tp.agent.gae_lambda) - - if self.tp.agent.estimate_value_using_gae: - discounted_returns = np.expand_dims(gae + values[:-1], -1) - else: - discounted_returns = np.expand_dims(np.array(self.discount(bootstrap_extended_rewards, - self.tp.agent.discount)), 1)[:-1] - return gae, discounted_returns - - def learn_from_batch(self, batch): - # batch contains a list of episodes to learn from - current_states, next_states, actions, rewards, game_overs, _ = self.extract_batch(batch) - - # get the values for the current states - result = self.main_network.online_network.predict(current_states) - current_state_values = result[0] - self.state_values.add_sample(current_state_values) - - # the targets for the state value estimator - num_transitions = len(game_overs) - state_value_head_targets = np.zeros((num_transitions, 1)) - - # estimate the advantage function - action_advantages = np.zeros((num_transitions, 1)) - - if self.policy_gradient_rescaler == PolicyGradientRescaler.A_VALUE: - if game_overs[-1]: - R = 0 - else: - R = self.main_network.online_network.predict(last_sample(next_states))[0] - - for i in reversed(range(num_transitions)): - R = rewards[i] + self.tp.agent.discount * R - state_value_head_targets[i] = R - action_advantages[i] = R - current_state_values[i] - - elif self.policy_gradient_rescaler == PolicyGradientRescaler.GAE: - # get bootstraps - bootstrapped_value = self.main_network.online_network.predict(last_sample(next_states))[0] - values = np.append(current_state_values, bootstrapped_value) - if game_overs[-1]: - values[-1] = 0 - - # get general discounted returns table - gae_values, state_value_head_targets = self.get_general_advantage_estimation_values(rewards, values) - action_advantages = np.vstack(gae_values) - else: - screen.warning("WARNING: The requested policy gradient rescaler is not available") - - action_advantages = action_advantages.squeeze(axis=-1) - if not self.env.discrete_controls and len(actions.shape) < 2: - actions = np.expand_dims(actions, -1) - - # train - result = self.main_network.online_network.accumulate_gradients({**current_states, 'output_1_0': actions}, - [state_value_head_targets, action_advantages]) - - # logging - total_loss, losses, unclipped_grads = result[:3] - self.action_advantages.add_sample(action_advantages) - self.unclipped_grads.add_sample(unclipped_grads) - self.value_loss.add_sample(losses[0]) - self.policy_loss.add_sample(losses[1]) - - return total_loss - - def choose_action(self, curr_state, phase=RunPhase.TRAIN): - # TODO: rename curr_state -> state - - # convert to batch so we can run it through the network - curr_state = { - k: np.expand_dims(np.array(curr_state[k]), 0) - for k in curr_state.keys() - } - - if self.env.discrete_controls: - # DISCRETE - state_value, action_probabilities = self.main_network.online_network.predict(curr_state) - action_probabilities = action_probabilities.squeeze() - if phase == RunPhase.TRAIN: - action = self.exploration_policy.get_action(action_probabilities) - else: - action = np.argmax(action_probabilities) - action_info = {"action_probability": action_probabilities[action], "state_value": state_value} - self.entropy.add_sample(-np.sum(action_probabilities * np.log(action_probabilities + eps))) - else: - # CONTINUOUS - state_value, action_values_mean, action_values_std = self.main_network.online_network.predict(curr_state) - action_values_mean = 
action_values_mean.squeeze() - action_values_std = action_values_std.squeeze() - if phase == RunPhase.TRAIN: - action = np.squeeze(np.random.randn(1, self.action_space_size) * action_values_std + action_values_mean) - else: - action = action_values_mean - action_info = {"action_probability": action, "state_value": state_value} - - return action, action_info diff --git a/agents/agent.py b/agents/agent.py deleted file mode 100644 index 888f1b2..0000000 --- a/agents/agent.py +++ /dev/null @@ -1,580 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -import scipy.ndimage -try: - import matplotlib.pyplot as plt -except: - from logger import failed_imports - failed_imports.append("matplotlib") - -import copy -from renderer import Renderer -from configurations import Preset -from collections import deque -from utils import LazyStack -from collections import OrderedDict -from utils import RunPhase, Signal, is_empty, RunningStat -from architectures import * -from exploration_policies import * -from memories import * -from memories.memory import * -from logger import logger, screen -import random -import time -import os -import itertools -from architectures.tensorflow_components.shared_variables import SharedRunningStats -from six.moves import range - - -class Agent(object): - def __init__(self, env, tuning_parameters, replicated_device=None, task_id=0): - """ - :param env: An environment instance - :type env: EnvironmentWrapper - :param tuning_parameters: A Preset class instance with all the running paramaters - :type tuning_parameters: Preset - :param replicated_device: A tensorflow device for distributed training (optional) - :type replicated_device: instancemethod - :param thread_id: The current thread id - :param thread_id: int - """ - - screen.log_title("Creating agent {}".format(task_id)) - self.task_id = task_id - self.sess = tuning_parameters.sess - self.env = tuning_parameters.env_instance = env - self.imitation = False - - # i/o dimensions - if not tuning_parameters.env.desired_observation_width or not tuning_parameters.env.desired_observation_height: - tuning_parameters.env.desired_observation_width = self.env.width - tuning_parameters.env.desired_observation_height = self.env.height - self.action_space_size = tuning_parameters.env.action_space_size = self.env.action_space_size - self.measurements_size = tuning_parameters.env.measurements_size = self.env.measurements_size - if tuning_parameters.agent.use_accumulated_reward_as_measurement: - self.measurements_size = tuning_parameters.env.measurements_size = (self.measurements_size[0] + 1,) - - # modules - if tuning_parameters.agent.load_memory_from_file_path: - screen.log_title("Loading replay buffer from pickle. 
Pickle path: {}" - .format(tuning_parameters.agent.load_memory_from_file_path)) - self.memory = read_pickle(tuning_parameters.agent.load_memory_from_file_path) - else: - self.memory = eval(tuning_parameters.memory + '(tuning_parameters)') - # self.architecture = eval(tuning_parameters.architecture) - - self.has_global = replicated_device is not None - self.replicated_device = replicated_device - self.worker_device = "/job:worker/task:{}/cpu:0".format(task_id) if replicated_device is not None else "/gpu:0" - - self.exploration_policy = eval(tuning_parameters.exploration.policy + '(tuning_parameters)') - self.evaluation_exploration_policy = eval(tuning_parameters.exploration.evaluation_policy - + '(tuning_parameters)') - self.evaluation_exploration_policy.change_phase(RunPhase.TEST) - - # initialize all internal variables - self.tp = tuning_parameters - self.in_heatup = False - self.total_reward_in_current_episode = 0 - self.total_steps_counter = 0 - self.running_reward = None - self.training_iteration = 0 - self.current_episode = self.tp.current_episode = 0 - self.curr_state = {} - self.current_episode_steps_counter = 0 - self.episode_running_info = {} - self.last_episode_evaluation_ran = 0 - self.running_observations = [] - logger.set_current_time(self.current_episode) - self.main_network = None - self.networks = [] - self.last_episode_images = [] - self.renderer = Renderer() - - # signals - self.signals = [] - self.loss = Signal('Loss') - self.signals.append(self.loss) - self.curr_learning_rate = Signal('Learning Rate') - self.signals.append(self.curr_learning_rate) - - if self.tp.env.normalize_observation and not self.env.is_state_type_image: - if not self.tp.distributed or not self.tp.agent.share_statistics_between_workers: - self.running_observation_stats = RunningStat((self.tp.env.desired_observation_width,)) - self.running_reward_stats = RunningStat(()) - if self.tp.checkpoint_restore_dir: - checkpoint_path = os.path.join(self.tp.checkpoint_restore_dir, "running_stats.p") - self.running_observation_stats = read_pickle(checkpoint_path) - else: - self.running_observation_stats = RunningStat((self.tp.env.desired_observation_width,)) - self.running_reward_stats = RunningStat(()) - else: - self.running_observation_stats = SharedRunningStats(self.tp, replicated_device, - shape=(self.tp.env.desired_observation_width,), - name='observation_stats') - self.running_reward_stats = SharedRunningStats(self.tp, replicated_device, - shape=(), - name='reward_stats') - - # env is already reset at this point. Otherwise we're getting an error where you cannot - # reset an env which is not done - self.reset_game(do_not_reset_env=True) - - # use seed - if self.tp.seed is not None: - random.seed(self.tp.seed) - np.random.seed(self.tp.seed) - - def log_to_screen(self, phase): - # log to screen - if self.current_episode >= 0: - if phase == RunPhase.TRAIN: - exploration = self.exploration_policy.get_control_param() - else: - exploration = self.evaluation_exploration_policy.get_control_param() - - screen.log_dict( - OrderedDict([ - ("Worker", self.task_id), - ("Episode", self.current_episode), - ("total reward", self.total_reward_in_current_episode), - ("exploration", exploration), - ("steps", self.total_steps_counter), - ("training iteration", self.training_iteration) - ]), - prefix=phase - ) - - def update_log(self, phase=RunPhase.TRAIN): - """ - Writes logging messages to screen and updates the log file with all the signal values. 
- :return: None - """ - # log all the signals to file - logger.set_current_time(self.current_episode) - logger.create_signal_value('Training Iter', self.training_iteration) - logger.create_signal_value('In Heatup', int(phase == RunPhase.HEATUP)) - logger.create_signal_value('ER #Transitions', self.memory.num_transitions()) - logger.create_signal_value('ER #Episodes', self.memory.length()) - logger.create_signal_value('Episode Length', self.current_episode_steps_counter) - logger.create_signal_value('Total steps', self.total_steps_counter) - logger.create_signal_value("Epsilon", self.exploration_policy.get_control_param()) - logger.create_signal_value("Training Reward", self.total_reward_in_current_episode - if phase == RunPhase.TRAIN else np.nan) - logger.create_signal_value('Evaluation Reward', self.total_reward_in_current_episode - if phase == RunPhase.TEST else np.nan) - logger.create_signal_value('Update Target Network', 0, overwrite=False) - logger.update_wall_clock_time(self.current_episode) - - for signal in self.signals: - logger.create_signal_value("{}/Mean".format(signal.name), signal.get_mean()) - logger.create_signal_value("{}/Stdev".format(signal.name), signal.get_stdev()) - logger.create_signal_value("{}/Max".format(signal.name), signal.get_max()) - logger.create_signal_value("{}/Min".format(signal.name), signal.get_min()) - - # dump - if self.current_episode % self.tp.visualization.dump_signals_to_csv_every_x_episodes == 0 \ - and self.current_episode > 0: - logger.dump_output_csv() - - def reset_game(self, do_not_reset_env=False): - """ - Resets all the episodic parameters and start a new environment episode. - :param do_not_reset_env: A boolean that allows prevention of environment reset - :return: None - """ - - for signal in self.signals: - signal.reset() - self.total_reward_in_current_episode = 0 - self.curr_state = {} - self.last_episode_images = [] - self.current_episode_steps_counter = 0 - self.episode_running_info = {} - if not do_not_reset_env: - self.env.reset() - self.exploration_policy.reset() - - # required for online plotting - if self.tp.visualization.plot_action_values_online: - if hasattr(self, 'episode_running_info') and hasattr(self.env, 'actions_description'): - for action in self.env.actions_description: - self.episode_running_info[action] = [] - plt.clf() - - if self.tp.agent.middleware_type == MiddlewareTypes.LSTM: - for network in self.networks: - network.online_network.curr_rnn_c_in = network.online_network.middleware_embedder.c_init - network.online_network.curr_rnn_h_in = network.online_network.middleware_embedder.h_init - - self.prepare_initial_state() - - def preprocess_observation(self, observation): - """ - Preprocesses the given observation. - For images - convert to grayscale, resize and convert to int. - For measurements vectors - normalize by a running average and std. 
- :param observation: The agents observation - :return: A processed version of the observation - """ - - if self.env.is_state_type_image: - # rescale - observation = scipy.misc.imresize(observation, - (self.tp.env.desired_observation_height, - self.tp.env.desired_observation_width), - interp=self.tp.rescaling_interpolation_type) - # rgb to y - if len(observation.shape) > 2 and observation.shape[2] > 1: - r, g, b = observation[:, :, 0], observation[:, :, 1], observation[:, :, 2] - observation = 0.2989 * r + 0.5870 * g + 0.1140 * b - - # Render the processed observation which is how the agent will see it - # Warning: this cannot currently be done in parallel to rendering the environment - if self.tp.visualization.render_observation: - if not self.renderer.is_open: - self.renderer.create_screen(observation.shape[0], observation.shape[1]) - self.renderer.render_image(observation) - - return observation.astype('uint8') - else: - if self.tp.env.normalize_observation and self.sess is not None: - # standardize the input observation using a running mean and std - if not self.tp.distributed or not self.tp.agent.share_statistics_between_workers: - self.running_observation_stats.push(observation) - observation = (observation - self.running_observation_stats.mean) / \ - (self.running_observation_stats.std + 1e-15) - observation = np.clip(observation, -5.0, 5.0) - return observation - - def learn_from_batch(self, batch): - """ - Given a batch of transitions, calculates their target values and updates the network. - :param batch: A list of transitions - :return: The loss of the training - """ - pass - - def train(self): - """ - A single training iteration. Sample a batch, train on it and update target networks. - :return: The training loss. - """ - batch = self.memory.sample(self.tp.batch_size) - loss = self.learn_from_batch(batch) - - if self.tp.learning_rate_decay_rate != 0: - self.curr_learning_rate.add_sample(self.tp.sess.run(self.tp.learning_rate)) - else: - self.curr_learning_rate.add_sample(self.tp.learning_rate) - - # update the target network of every network that has a target network - if self.total_steps_counter % self.tp.agent.num_steps_between_copying_online_weights_to_target == 0: - for network in self.networks: - network.update_target_network(self.tp.agent.rate_for_copying_weights_to_target) - logger.create_signal_value('Update Target Network', 1) - else: - logger.create_signal_value('Update Target Network', 0, overwrite=False) - - return loss - - def extract_batch(self, batch): - """ - Extracts a single numpy array for each object in a batch of transitions (state, action, etc.) 
- :param batch: An array of transitions - :return: For each transition element, returns a numpy array of all the transitions in the batch - """ - current_states = {} - next_states = {} - current_states['observation'] = np.array([np.array(transition.state['observation']) for transition in batch]) - next_states['observation'] = np.array([np.array(transition.next_state['observation']) for transition in batch]) - actions = np.array([transition.action for transition in batch]) - rewards = np.array([transition.reward for transition in batch]) - game_overs = np.array([transition.game_over for transition in batch]) - total_return = np.array([transition.total_return for transition in batch]) - - # get the entire state including measurements if available - if self.tp.agent.use_measurements: - current_states['measurements'] = np.array([transition.state['measurements'] for transition in batch]) - next_states['measurements'] = np.array([transition.next_state['measurements'] for transition in batch]) - - return current_states, next_states, actions, rewards, game_overs, total_return - - def plot_action_values_online(self): - """ - Plot an animated graph of the value of each possible action during the episode - :return: None - """ - - plt.clf() - for key, data_list in self.episode_running_info.items(): - plt.plot(data_list, label=key) - plt.legend() - plt.pause(0.00000001) - - def choose_action(self, curr_state, phase=RunPhase.TRAIN): - """ - choose an action to act with in the current episode being played. Different behavior might be exhibited when training - or testing. - - :param curr_state: the current state to act upon. - :param phase: the current phase: training or testing. - :return: chosen action, some action value describing the action (q-value, probability, etc) - """ - pass - - def preprocess_reward(self, reward): - if self.tp.env.reward_scaling: - reward /= float(self.tp.env.reward_scaling) - if self.tp.env.reward_clipping_max: - reward = min(reward, self.tp.env.reward_clipping_max) - if self.tp.env.reward_clipping_min: - reward = max(reward, self.tp.env.reward_clipping_min) - return reward - - def tf_input_state(self, curr_state): - """ - convert curr_state into input tensors tensorflow is expecting. 
- """ - # add batch axis with length 1 onto each value - # extract values from the state based on agent.input_types - input_state = {} - for input_name in self.tp.agent.input_types.keys(): - input_state[input_name] = np.expand_dims(np.array(curr_state[input_name]), 0) - return input_state - - def prepare_initial_state(self): - """ - Create an initial state when starting a new episode - :return: None - """ - observation = self.preprocess_observation(self.env.state['observation']) - self.curr_stack = deque([observation]*self.tp.env.observation_stack_size, maxlen=self.tp.env.observation_stack_size) - observation = LazyStack(self.curr_stack, -1) - - self.curr_state = { - 'observation': observation - } - if self.tp.agent.use_measurements: - if 'measurements' in self.env.state.keys(): - self.curr_state['measurements'] = self.env.state['measurements'] - else: - self.curr_state['measurements'] = np.zeros(0) - if self.tp.agent.use_accumulated_reward_as_measurement: - self.curr_state['measurements'] = np.append(self.curr_state['measurements'], 0) - - def act(self, phase=RunPhase.TRAIN): - """ - Take one step in the environment according to the network prediction and store the transition in memory - :param phase: Either Train or Test to specify if greedy actions should be used and if transitions should be stored - :return: A boolean value that signals an episode termination - """ - - if phase != RunPhase.TEST: - self.total_steps_counter += 1 - self.current_episode_steps_counter += 1 - - # get new action - action_info = {"action_probability": 1.0 / self.env.action_space_size, "action_value": 0, "max_action_value": 0} - - if phase == RunPhase.HEATUP and not self.tp.heatup_using_network_decisions: - action = self.env.get_random_action() - else: - action, action_info = self.choose_action(self.curr_state, phase=phase) - - # perform action - if type(action) == np.ndarray: - action = action.squeeze() - result = self.env.step(action) - - shaped_reward = self.preprocess_reward(result['reward']) - if 'action_intrinsic_reward' in action_info.keys(): - shaped_reward += action_info['action_intrinsic_reward'] - # TODO: should total_reward_in_current_episode include shaped_reward? 
- self.total_reward_in_current_episode += result['reward'] - next_state = copy.copy(result['state']) - next_state['observation'] = self.preprocess_observation(next_state['observation']) - - # plot action values online - if self.tp.visualization.plot_action_values_online and phase != RunPhase.HEATUP: - self.plot_action_values_online() - - # initialize the next state - # TODO: provide option to stack more than just the observation - self.curr_stack.append(next_state['observation']) - observation = LazyStack(self.curr_stack, -1) - - next_state['observation'] = observation - if self.tp.agent.use_measurements: - if 'measurements' in result['state'].keys(): - next_state['measurements'] = result['state']['measurements'] - else: - next_state['measurements'] = np.zeros(0) - if self.tp.agent.use_accumulated_reward_as_measurement: - next_state['measurements'] = np.append(next_state['measurements'], self.total_reward_in_current_episode) - - # store the transition only if we are training - if phase == RunPhase.TRAIN or phase == RunPhase.HEATUP: - transition = Transition(self.curr_state, result['action'], shaped_reward, next_state, result['done']) - for key in action_info.keys(): - transition.info[key] = action_info[key] - if self.tp.agent.add_a_normalized_timestep_to_the_observation: - transition.info['timestep'] = float(self.current_episode_steps_counter) / self.env.timestep_limit - self.memory.store(transition) - elif phase == RunPhase.TEST and self.tp.visualization.dump_gifs: - # we store the transitions only for saving gifs - self.last_episode_images.append(self.env.get_rendered_image()) - - # update the current state for the next step - self.curr_state = next_state - - # deal with episode termination - if result['done']: - if self.tp.visualization.dump_csv: - self.update_log(phase=phase) - self.log_to_screen(phase=phase) - - if phase == RunPhase.TRAIN or phase == RunPhase.HEATUP: - self.reset_game() - - self.current_episode += 1 - self.tp.current_episode = self.current_episode - - # return episode really ended - return result['done'] - - def evaluate(self, num_episodes, keep_networks_synced=False): - """ - Run in an evaluation mode for several episodes. Actions will be chosen greedily. 
- :param keep_networks_synced: keep the online network in sync with the global network after every episode - :param num_episodes: The number of episodes to evaluate on - :return: None - """ - - max_reward_achieved = -float('inf') - average_evaluation_reward = 0 - screen.log_title("Running evaluation") - self.env.change_phase(RunPhase.TEST) - for i in range(num_episodes): - # keep the online network in sync with the global network - if keep_networks_synced: - for network in self.networks: - network.sync() - - episode_ended = False - while not episode_ended: - episode_ended = self.act(phase=RunPhase.TEST) - - if keep_networks_synced \ - and self.total_steps_counter % self.tp.agent.update_evaluation_agent_network_after_every_num_steps: - for network in self.networks: - network.sync() - - if self.total_reward_in_current_episode > max_reward_achieved: - max_reward_achieved = self.total_reward_in_current_episode - frame_skipping = int(5/self.tp.env.frame_skip) - if self.tp.visualization.dump_gifs: - logger.create_gif(self.last_episode_images[::frame_skipping], - name='score-{}'.format(max_reward_achieved), fps=10) - - average_evaluation_reward += self.total_reward_in_current_episode - self.reset_game() - - average_evaluation_reward /= float(num_episodes) - - self.env.change_phase(RunPhase.TRAIN) - screen.log_title("Evaluation done. Average reward = {}.".format(average_evaluation_reward)) - - def post_training_commands(self): - pass - - def improve(self): - """ - Training algorithms wrapper. Heatup >> [ Evaluate >> Play >> Train >> Save checkpoint ] - - :return: None - """ - - # synchronize the online network weights with the global network - for network in self.networks: - network.sync() - - # heatup phase - if self.tp.num_heatup_steps != 0: - self.in_heatup = True - screen.log_title("Starting heatup {}".format(self.task_id)) - num_steps_required_for_one_training_batch = self.tp.batch_size * self.tp.env.observation_stack_size - for step in range(max(self.tp.num_heatup_steps, num_steps_required_for_one_training_batch)): - self.act(phase=RunPhase.HEATUP) - - # training phase - self.in_heatup = False - screen.log_title("Starting training {}".format(self.task_id)) - self.exploration_policy.change_phase(RunPhase.TRAIN) - training_start_time = time.time() - model_snapshots_periods_passed = -1 - self.reset_game() - - while self.training_iteration < self.tp.num_training_iterations: - # evaluate - evaluate_agent = (self.last_episode_evaluation_ran is not self.current_episode) and \ - (self.current_episode % self.tp.evaluate_every_x_episodes == 0) - evaluate_agent = evaluate_agent or \ - (self.imitation and self.training_iteration > 0 and - self.training_iteration % self.tp.evaluate_every_x_training_iterations == 0) - - if evaluate_agent: - self.env.reset(force_environment_reset=True) - self.last_episode_evaluation_ran = self.current_episode - self.evaluate(self.tp.evaluation_episodes) - - # snapshot model - if self.tp.save_model_sec and self.tp.save_model_sec > 0 and not self.tp.distributed: - total_training_time = time.time() - training_start_time - current_snapshot_period = (int(total_training_time) // self.tp.save_model_sec) - if current_snapshot_period > model_snapshots_periods_passed: - model_snapshots_periods_passed = current_snapshot_period - self.save_model(model_snapshots_periods_passed) - if hasattr(self, 'running_observation_state') and self.running_observation_stats is not None: - to_pickle(self.running_observation_stats, - os.path.join(self.tp.save_model_dir, - 
"running_stats.p".format(model_snapshots_periods_passed))) - - # play and record in replay buffer - if self.tp.agent.collect_new_data: - if self.tp.agent.step_until_collecting_full_episodes: - step = 0 - while step < self.tp.agent.num_consecutive_playing_steps or self.memory.get_episode(-1).length() != 0: - self.act() - step += 1 - else: - for step in range(self.tp.agent.num_consecutive_playing_steps): - self.act() - - # train - if self.tp.train: - for step in range(self.tp.agent.num_consecutive_training_steps): - loss = self.train() - self.loss.add_sample(loss) - self.training_iteration += 1 - if self.imitation: - self.log_to_screen(RunPhase.TRAIN) - self.post_training_commands() - - def save_model(self, model_id): - self.main_network.save_model(model_id) diff --git a/agents/bc_agent.py b/agents/bc_agent.py deleted file mode 100644 index 70fe3e6..0000000 --- a/agents/bc_agent.py +++ /dev/null @@ -1,39 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -import numpy as np - -from agents.imitation_agent import ImitationAgent - - -# Behavioral Cloning Agent -class BCAgent(ImitationAgent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - ImitationAgent.__init__(self, env, tuning_parameters, replicated_device, thread_id) - - def learn_from_batch(self, batch): - current_states, _, actions, _, _, _ = self.extract_batch(batch) - - # the targets for the network are the actions since this is supervised learning - if self.env.discrete_controls: - targets = np.eye(self.env.action_space_size)[[actions]] - else: - targets = actions - - result = self.main_network.train_and_sync_networks(current_states, targets) - total_loss = result[0] - - return total_loss diff --git a/agents/bootstrapped_dqn_agent.py b/agents/bootstrapped_dqn_agent.py deleted file mode 100644 index 3476022..0000000 --- a/agents/bootstrapped_dqn_agent.py +++ /dev/null @@ -1,58 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# - -from agents.value_optimization_agent import * - - -# Bootstrapped DQN - https://arxiv.org/pdf/1602.04621.pdf -class BootstrappedDQNAgent(ValueOptimizationAgent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - ValueOptimizationAgent.__init__(self, env, tuning_parameters, replicated_device, thread_id) - - def reset_game(self, do_not_reset_env=False): - ValueOptimizationAgent.reset_game(self, do_not_reset_env) - self.exploration_policy.select_head() - - def learn_from_batch(self, batch): - current_states, next_states, actions, rewards, game_overs, _ = self.extract_batch(batch) - - # for the action we actually took, the error is: - # TD error = r + discount*max(q_st_plus_1) - q_st - # for all other actions, the error is 0 - q_st_plus_1 = self.main_network.target_network.predict(next_states) - # initialize with the current prediction so that we will - TD_targets = self.main_network.online_network.predict(current_states) - - # only update the action that we have actually done in this transition - for i in range(self.tp.batch_size): - mask = batch[i].info['mask'] - for head_idx in range(self.tp.exploration.architecture_num_q_heads): - if mask[head_idx] == 1: - TD_targets[head_idx][i, actions[i]] = rewards[i] + \ - (1.0 - game_overs[i]) * self.tp.agent.discount * np.max( - q_st_plus_1[head_idx][i], 0) - - result = self.main_network.train_and_sync_networks(current_states, TD_targets) - - total_loss = result[0] - - return total_loss - - def act(self, phase=RunPhase.TRAIN): - ValueOptimizationAgent.act(self, phase) - mask = np.random.binomial(1, self.tp.exploration.bootstrapped_data_sharing_probability, - self.tp.exploration.architecture_num_q_heads) - self.memory.update_last_transition_info({'mask': mask}) diff --git a/agents/categorical_dqn_agent.py b/agents/categorical_dqn_agent.py deleted file mode 100644 index dec8ba2..0000000 --- a/agents/categorical_dqn_agent.py +++ /dev/null @@ -1,60 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# - -from agents.value_optimization_agent import * - - -# Categorical Deep Q Network - https://arxiv.org/pdf/1707.06887.pdf -class CategoricalDQNAgent(ValueOptimizationAgent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - ValueOptimizationAgent.__init__(self, env, tuning_parameters, replicated_device, thread_id) - self.z_values = np.linspace(self.tp.agent.v_min, self.tp.agent.v_max, self.tp.agent.atoms) - - # prediction's format is (batch,actions,atoms) - def get_q_values(self, prediction): - return np.dot(prediction, self.z_values) - - def learn_from_batch(self, batch): - current_states, next_states, actions, rewards, game_overs, _ = self.extract_batch(batch) - - # for the action we actually took, the error is calculated by the atoms distribution - # for all other actions, the error is 0 - distributed_q_st_plus_1 = self.main_network.target_network.predict(next_states) - # initialize with the current prediction so that we will - TD_targets = self.main_network.online_network.predict(current_states) - - # only update the action that we have actually done in this transition - target_actions = np.argmax(self.get_q_values(distributed_q_st_plus_1), axis=1) - m = np.zeros((self.tp.batch_size, self.z_values.size)) - - batches = np.arange(self.tp.batch_size) - for j in range(self.z_values.size): - tzj = np.fmax(np.fmin(rewards + (1.0 - game_overs) * self.tp.agent.discount * self.z_values[j], - self.z_values[self.z_values.size - 1]), - self.z_values[0]) - bj = (tzj - self.z_values[0])/(self.z_values[1] - self.z_values[0]) - u = (np.ceil(bj)).astype(int) - l = (np.floor(bj)).astype(int) - m[batches, l] = m[batches, l] + (distributed_q_st_plus_1[batches, target_actions, j] * (u - bj)) - m[batches, u] = m[batches, u] + (distributed_q_st_plus_1[batches, target_actions, j] * (bj - l)) - # total_loss = cross entropy between actual result above and predicted result for the given action - TD_targets[batches, actions] = m - - result = self.main_network.train_and_sync_networks(current_states, TD_targets) - total_loss = result[0] - - return total_loss - diff --git a/agents/clipped_ppo_agent.py b/agents/clipped_ppo_agent.py deleted file mode 100644 index ad066ae..0000000 --- a/agents/clipped_ppo_agent.py +++ /dev/null @@ -1,212 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# - -from agents.actor_critic_agent import * -from random import shuffle - - -# Clipped Proximal Policy Optimization - https://arxiv.org/abs/1707.06347 -class ClippedPPOAgent(ActorCriticAgent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - ActorCriticAgent.__init__(self, env, tuning_parameters, replicated_device, thread_id, - create_target_network=True) - # signals definition - self.value_loss = Signal('Value Loss') - self.signals.append(self.value_loss) - self.policy_loss = Signal('Policy Loss') - self.signals.append(self.policy_loss) - self.total_kl_divergence_during_training_process = 0.0 - self.unclipped_grads = Signal('Grads (unclipped)') - self.signals.append(self.unclipped_grads) - self.value_targets = Signal('Value Targets') - self.signals.append(self.value_targets) - self.kl_divergence = Signal('KL Divergence') - self.signals.append(self.kl_divergence) - - def fill_advantages(self, batch): - current_states, next_states, actions, rewards, game_overs, total_return = self.extract_batch(batch) - - current_state_values = self.main_network.online_network.predict(current_states)[0] - current_state_values = current_state_values.squeeze() - self.state_values.add_sample(current_state_values) - - # calculate advantages - advantages = [] - value_targets = [] - if self.policy_gradient_rescaler == PolicyGradientRescaler.A_VALUE: - advantages = total_return - current_state_values - elif self.policy_gradient_rescaler == PolicyGradientRescaler.GAE: - # get bootstraps - episode_start_idx = 0 - advantages = np.array([]) - value_targets = np.array([]) - for idx, game_over in enumerate(game_overs): - if game_over: - # get advantages for the rollout - value_bootstrapping = np.zeros((1,)) - rollout_state_values = np.append(current_state_values[episode_start_idx:idx+1], value_bootstrapping) - - rollout_advantages, gae_based_value_targets = \ - self.get_general_advantage_estimation_values(rewards[episode_start_idx:idx+1], - rollout_state_values) - episode_start_idx = idx + 1 - advantages = np.append(advantages, rollout_advantages) - value_targets = np.append(value_targets, gae_based_value_targets) - else: - screen.warning("WARNING: The requested policy gradient rescaler is not available") - - # standardize - advantages = (advantages - np.mean(advantages)) / (np.std(advantages) + 1e-8) - - for transition, advantage, value_target in zip(batch, advantages, value_targets): - transition.info['advantage'] = advantage - transition.info['gae_based_value_target'] = value_target - - self.action_advantages.add_sample(advantages) - - def train_network(self, dataset, epochs): - loss = [] - for j in range(epochs): - loss = { - 'total_loss': [], - 'policy_losses': [], - 'unclipped_grads': [], - 'fetch_result': [] - } - shuffle(dataset) - for i in range(int(len(dataset) / self.tp.batch_size)): - batch = dataset[i * self.tp.batch_size:(i + 1) * self.tp.batch_size] - current_states, _, actions, _, _, total_return = self.extract_batch(batch) - - advantages = np.array([t.info['advantage'] for t in batch]) - gae_based_value_targets = np.array([t.info['gae_based_value_target'] for t in batch]) - if not self.tp.env_instance.discrete_controls and len(actions.shape) == 1: - actions = np.expand_dims(actions, -1) - - # get old policy probabilities and distribution - result = self.main_network.target_network.predict(current_states) - old_policy_distribution = result[1:] - - # calculate gradients and apply on both the local policy network and on the global policy network - fetches = 
[self.main_network.online_network.output_heads[1].kl_divergence, - self.main_network.online_network.output_heads[1].entropy] - - total_return = np.expand_dims(total_return, -1) - value_targets = gae_based_value_targets if self.tp.agent.estimate_value_using_gae else total_return - inputs = copy.copy(current_states) - # TODO: why is this output 0 and not output 1? - inputs['output_0_0'] = actions - # TODO: does old_policy_distribution really need to be represented as a list? - # A: yes it does, in the event of discrete controls, it has just a mean - # otherwise, it has both a mean and standard deviation - for input_index, input in enumerate(old_policy_distribution): - inputs['output_0_{}'.format(input_index + 1)] = input - total_loss, policy_losses, unclipped_grads, fetch_result =\ - self.main_network.online_network.accumulate_gradients( - inputs, [total_return, advantages], additional_fetches=fetches) - - self.value_targets.add_sample(value_targets) - if self.tp.distributed: - self.main_network.apply_gradients_to_global_network() - self.main_network.update_online_network() - else: - self.main_network.apply_gradients_to_online_network() - - self.main_network.online_network.reset_accumulated_gradients() - - loss['total_loss'].append(total_loss) - loss['policy_losses'].append(policy_losses) - loss['unclipped_grads'].append(unclipped_grads) - loss['fetch_result'].append(fetch_result) - - self.unclipped_grads.add_sample(unclipped_grads) - - for key in loss.keys(): - loss[key] = np.mean(loss[key], 0) - - if self.tp.learning_rate_decay_rate != 0: - curr_learning_rate = self.main_network.online_network.get_variable_value(self.tp.learning_rate) - self.curr_learning_rate.add_sample(curr_learning_rate) - else: - curr_learning_rate = self.tp.learning_rate - - # log training parameters - screen.log_dict( - OrderedDict([ - ("Surrogate loss", loss['policy_losses'][0]), - ("KL divergence", loss['fetch_result'][0]), - ("Entropy", loss['fetch_result'][1]), - ("training epoch", j), - ("learning_rate", curr_learning_rate) - ]), - prefix="Policy training" - ) - - self.total_kl_divergence_during_training_process = loss['fetch_result'][0] - self.entropy.add_sample(loss['fetch_result'][1]) - self.kl_divergence.add_sample(loss['fetch_result'][0]) - return policy_losses - - def post_training_commands(self): - - # clean memory - self.memory.clean() - - def train(self): - self.main_network.sync() - - dataset = self.memory.transitions - - self.fill_advantages(dataset) - - # take only the requested number of steps - dataset = dataset[:self.tp.agent.num_consecutive_playing_steps] - - if self.tp.distributed and self.tp.agent.share_statistics_between_workers: - self.running_observation_stats.push(np.array([np.array(t.state['observation']) for t in dataset])) - - losses = self.train_network(dataset, 10) - self.value_loss.add_sample(losses[0]) - self.policy_loss.add_sample(losses[1]) - self.update_log() # should be done in order to update the data that has been accumulated * while not playing * - return np.append(losses[0], losses[1]) - - def choose_action(self, current_state, phase=RunPhase.TRAIN): - if self.env.discrete_controls: - # DISCRETE - _, action_values = self.main_network.online_network.predict(self.tf_input_state(current_state)) - action_values = action_values.squeeze() - - if phase == RunPhase.TRAIN: - action = self.exploration_policy.get_action(action_values) - else: - action = np.argmax(action_values) - action_info = {"action_probability": action_values[action]} - # 
self.entropy.add_sample(-np.sum(action_values * np.log(action_values))) - else: - # CONTINUOUS - _, action_values_mean, action_values_std = self.main_network.online_network.predict(self.tf_input_state(current_state)) - action_values_mean = action_values_mean.squeeze() - action_values_std = action_values_std.squeeze() - if phase == RunPhase.TRAIN: - action = np.squeeze(np.random.randn(1, self.action_space_size) * action_values_std + action_values_mean) - # if self.current_episode % 5 == 0 and self.current_episode_steps_counter < 5: - # print action - else: - action = action_values_mean - action_info = {"action_probability": action_values_mean} - - return action, action_info diff --git a/agents/ddpg_agent.py b/agents/ddpg_agent.py deleted file mode 100644 index 425f1de..0000000 --- a/agents/ddpg_agent.py +++ /dev/null @@ -1,109 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -from agents.actor_critic_agent import * -from configurations import * - - -# Deep Deterministic Policy Gradients Network - https://arxiv.org/pdf/1509.02971.pdf -class DDPGAgent(ActorCriticAgent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - ActorCriticAgent.__init__(self, env, tuning_parameters, replicated_device, thread_id, - create_target_network=True) - # define critic network - self.critic_network = self.main_network - # self.networks.append(self.critic_network) - - # define actor network - tuning_parameters.agent.input_types = {'observation': InputTypes.Observation} - tuning_parameters.agent.output_types = [OutputTypes.Pi] - self.actor_network = NetworkWrapper(tuning_parameters, True, self.has_global, 'actor', - self.replicated_device, self.worker_device) - self.networks.append(self.actor_network) - - self.q_values = Signal("Q") - self.signals.append(self.q_values) - - self.reset_game(do_not_reset_env=True) - - def learn_from_batch(self, batch): - current_states, next_states, actions, rewards, game_overs, _ = self.extract_batch(batch) - - # TD error = r + discount*max(q_st_plus_1) - q_st - next_actions = self.actor_network.target_network.predict(next_states) - inputs = copy.copy(next_states) - inputs['action'] = next_actions - q_st_plus_1 = self.critic_network.target_network.predict(inputs) - TD_targets = np.expand_dims(rewards, -1) + \ - (1.0 - np.expand_dims(game_overs, -1)) * self.tp.agent.discount * q_st_plus_1 - - # get the gradients of the critic output with respect to the action - actions_mean = self.actor_network.online_network.predict(current_states) - critic_online_network = self.critic_network.online_network - # TODO: convert into call to predict, current method ignores lstm middleware for example - action_gradients = self.critic_network.sess.run(critic_online_network.gradients_wrt_inputs['action'], - feed_dict=critic_online_network._feed_dict({ - **current_states, - 'action': actions_mean, - }))[0] - - # train the critic - if len(actions.shape) == 1: - actions = np.expand_dims(actions, -1) - result = 
self.critic_network.train_and_sync_networks({**current_states, 'action': actions}, TD_targets) - total_loss = result[0] - - # apply the gradients from the critic to the actor - actor_online_network = self.actor_network.online_network - gradients = self.actor_network.sess.run(actor_online_network.weighted_gradients, - feed_dict=actor_online_network._feed_dict({ - **current_states, - actor_online_network.gradients_weights_ph: -action_gradients, - })) - if self.actor_network.has_global: - self.actor_network.global_network.apply_gradients(gradients) - self.actor_network.update_online_network() - else: - self.actor_network.online_network.apply_gradients(gradients) - - return total_loss - - def train(self): - return Agent.train(self) - - def choose_action(self, curr_state, phase=RunPhase.TRAIN): - assert not self.env.discrete_controls, 'DDPG works only for continuous control problems' - result = self.actor_network.online_network.predict(self.tf_input_state(curr_state)) - action_values = result[0].squeeze() - - if phase == RunPhase.TRAIN: - action = self.exploration_policy.get_action(action_values) - else: - action = action_values - - action = np.clip(action, self.env.action_space_low, self.env.action_space_high) - - # get q value - action_batch = np.expand_dims(action, 0) - if type(action) != np.ndarray: - action_batch = np.array([[action]]) - inputs = self.tf_input_state(curr_state) - inputs['action'] = action_batch - q_value = self.critic_network.online_network.predict(inputs)[0] - self.q_values.add_sample(q_value) - action_info = {"action_value": q_value} - - return action, action_info diff --git a/agents/ddqn_agent.py b/agents/ddqn_agent.py deleted file mode 100644 index 838ae3f..0000000 --- a/agents/ddqn_agent.py +++ /dev/null @@ -1,42 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
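The DDPG `learn_from_batch` removed above splits the update in two: a standard one-step TD regression for the critic, and an actor step that follows the critic's gradient with respect to the action. The critic target reduces to the following sketch (array names are illustrative; `q_next_target` stands for Q_target(s', mu_target(s'))):

```python
import numpy as np

def ddpg_critic_targets(rewards, dones, q_next_target, gamma=0.99):
    """One-step DDPG critic targets: y = r + (1 - done) * gamma * Q_target(s', mu_target(s'))."""
    return rewards[:, None] + (1.0 - dones[:, None]) * gamma * q_next_target
```

For the actor, the deleted code evaluates dQ/da at a = mu(s) via `gradients_wrt_inputs['action']` and feeds the negated result in as gradient weights, which amounts to the chain rule dQ/dtheta = dQ/da * dmu/dtheta of the deterministic policy gradient.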
-# - -from agents.value_optimization_agent import * - - -# Double DQN - https://arxiv.org/abs/1509.06461 -class DDQNAgent(ValueOptimizationAgent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - ValueOptimizationAgent.__init__(self, env, tuning_parameters, replicated_device, thread_id) - - def learn_from_batch(self, batch): - current_states, next_states, actions, rewards, game_overs, _ = self.extract_batch(batch) - - selected_actions = np.argmax(self.main_network.online_network.predict(next_states), 1) - q_st_plus_1 = self.main_network.target_network.predict(next_states) - TD_targets = self.main_network.online_network.predict(current_states) - - # initialize with the current prediction so that we will - # only update the action that we have actually done in this transition - for i in range(self.tp.batch_size): - TD_targets[i, actions[i]] = rewards[i] \ - + (1.0 - game_overs[i]) * self.tp.agent.discount * q_st_plus_1[i][ - selected_actions[i]] - - result = self.main_network.train_and_sync_networks(current_states, TD_targets) - total_loss = result[0] - - return total_loss diff --git a/agents/dfp_agent.py b/agents/dfp_agent.py deleted file mode 100644 index 8f98b94..0000000 --- a/agents/dfp_agent.py +++ /dev/null @@ -1,86 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
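The DDQNAgent removed above differs from vanilla DQN only in how the bootstrap action is picked: the online network selects it, the target network evaluates it. A vectorized sketch of that target (array names are illustrative, not the Coach API):

```python
import numpy as np

def double_dqn_targets(q_online_next, q_target_next, q_online_current,
                       actions, rewards, dones, gamma=0.99):
    """Double DQN targets: select with the online net, evaluate with the target net.
    q_* arrays have shape (batch, num_actions); actions/rewards/dones are (batch,)."""
    targets = q_online_current.copy()            # start from the current prediction...
    batch_idx = np.arange(len(actions))
    selected = np.argmax(q_online_next, axis=1)  # action selection: online network
    bootstrap = q_target_next[batch_idx, selected]  # action evaluation: target network
    # ...so that only the action actually taken in each transition is updated
    targets[batch_idx, actions] = rewards + (1.0 - dones) * gamma * bootstrap
    return targets
```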
-# - -from agents.agent import * - - -# Direct Future Prediction Agent - http://vladlen.info/papers/learning-to-act.pdf -class DFPAgent(Agent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - Agent.__init__(self, env, tuning_parameters, replicated_device, thread_id) - self.current_goal = self.tp.agent.goal_vector - self.main_network = NetworkWrapper(tuning_parameters, False, self.has_global, 'main', - self.replicated_device, self.worker_device) - self.networks.append(self.main_network) - - def learn_from_batch(self, batch): - current_states, next_states, actions, rewards, game_overs, total_returns = self.extract_batch(batch) - - # create the inputs for the network - input = current_states - input['goal'] = np.repeat(np.expand_dims(self.current_goal, 0), self.tp.batch_size, 0) - - # get the current outputs of the network - targets = self.main_network.online_network.predict(input) - - # change the targets for the taken actions - for i in range(self.tp.batch_size): - targets[i, actions[i]] = batch[i].info['future_measurements'].flatten() - - result = self.main_network.train_and_sync_networks(input, targets) - total_loss = result[0] - - return total_loss - - def choose_action(self, curr_state, phase=RunPhase.TRAIN): - # convert to batch so we can run it through the network - observation = np.expand_dims(np.array(curr_state['observation']), 0) - measurements = np.expand_dims(np.array(curr_state['measurements']), 0) - goal = np.expand_dims(self.current_goal, 0) - - # predict the future measurements - measurements_future_prediction = self.main_network.online_network.predict({ - "observation": observation, - "measurements": measurements, - "goal": goal})[0] - action_values = np.zeros((self.action_space_size,)) - num_steps_used_for_objective = len(self.tp.agent.future_measurements_weights) - - # calculate the score of each action by multiplying it's future measurements with the goal vector - for action_idx in range(self.action_space_size): - action_measurements = measurements_future_prediction[action_idx] - action_measurements = np.reshape(action_measurements, - (self.tp.agent.num_predicted_steps_ahead, self.measurements_size[0])) - future_steps_values = np.dot(action_measurements, self.current_goal) - action_values[action_idx] = np.dot(future_steps_values[-num_steps_used_for_objective:], - self.tp.agent.future_measurements_weights) - - # choose action according to the exploration policy and the current phase (evaluating or training the agent) - if phase == RunPhase.TRAIN: - action = self.exploration_policy.get_action(action_values) - else: - action = np.argmax(action_values) - - action_values = action_values.squeeze() - - # store information for plotting interactively (actual plotting is done in agent) - if self.tp.visualization.plot_action_values_online: - for idx, action_name in enumerate(self.env.actions_description): - self.episode_running_info[action_name].append(action_values[idx]) - - action_info = {"action_probability": 0, "action_value": action_values[action]} - - return action, action_info diff --git a/agents/distributional_dqn_agent.py b/agents/distributional_dqn_agent.py deleted file mode 100644 index d7c0088..0000000 --- a/agents/distributional_dqn_agent.py +++ /dev/null @@ -1,60 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -from agents.value_optimization_agent import * - - -# Distributional Deep Q Network - https://arxiv.org/pdf/1707.06887.pdf -class DistributionalDQNAgent(ValueOptimizationAgent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - ValueOptimizationAgent.__init__(self, env, tuning_parameters, replicated_device, thread_id) - self.z_values = np.linspace(self.tp.agent.v_min, self.tp.agent.v_max, self.tp.agent.atoms) - - # prediction's format is (batch,actions,atoms) - def get_q_values(self, prediction): - return np.dot(prediction, self.z_values) - - def learn_from_batch(self, batch): - current_states, next_states, actions, rewards, game_overs, _ = self.extract_batch(batch) - - # for the action we actually took, the error is calculated by the atoms distribution - # for all other actions, the error is 0 - distributed_q_st_plus_1 = self.main_network.target_network.predict(next_states) - # initialize with the current prediction so that we will - TD_targets = self.main_network.online_network.predict(current_states) - - # only update the action that we have actually done in this transition - target_actions = np.argmax(self.get_q_values(distributed_q_st_plus_1), axis=1) - m = np.zeros((self.tp.batch_size, self.z_values.size)) - - batches = np.arange(self.tp.batch_size) - for j in range(self.z_values.size): - tzj = np.fmax(np.fmin(rewards + (1.0 - game_overs) * self.tp.agent.discount * self.z_values[j], - self.z_values[self.z_values.size - 1]), - self.z_values[0]) - bj = (tzj - self.z_values[0])/(self.z_values[1] - self.z_values[0]) - u = (np.ceil(bj)).astype(int) - l = (np.floor(bj)).astype(int) - m[batches, l] = m[batches, l] + (distributed_q_st_plus_1[batches, target_actions, j] * (u - bj)) - m[batches, u] = m[batches, u] + (distributed_q_st_plus_1[batches, target_actions, j] * (bj - l)) - # total_loss = cross entropy between actual result above and predicted result for the given action - TD_targets[batches, actions] = m - - result = self.main_network.train_and_sync_networks(current_states, TD_targets) - total_loss = result[0] - - return total_loss - diff --git a/agents/dqn_agent.py b/agents/dqn_agent.py deleted file mode 100644 index 70c0c7d..0000000 --- a/agents/dqn_agent.py +++ /dev/null @@ -1,43 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
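Stepping back to the DFPAgent removed a few hunks above: it learns no value function at all; instead it predicts future measurements per action and scores each action by how well those predictions align with the current goal vector. A rough sketch of that scoring (shapes and names are illustrative, not the Coach API):

```python
import numpy as np

def dfp_action_scores(predicted_future, goal, step_weights):
    """predicted_future: (num_actions, num_steps * meas_dim) network output
    goal:               (meas_dim,) weighting over the measurements
    step_weights:       weights over the last len(step_weights) predicted steps"""
    meas_dim = goal.shape[0]
    k = len(step_weights)
    scores = np.zeros(predicted_future.shape[0])
    for a in range(predicted_future.shape[0]):
        per_step = predicted_future[a].reshape(-1, meas_dim) @ goal  # value of each future step
        scores[a] = per_step[-k:] @ step_weights                     # weight the last k steps
    return scores
```

The exploration policy then chooses among these scores exactly as a Q-based agent would choose among Q-values.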
-# - -from agents.value_optimization_agent import * - - -# Deep Q Network - https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf -class DQNAgent(ValueOptimizationAgent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - ValueOptimizationAgent.__init__(self, env, tuning_parameters, replicated_device, thread_id) - - def learn_from_batch(self, batch): - current_states, next_states, actions, rewards, game_overs, _ = self.extract_batch(batch) - - # for the action we actually took, the error is: - # TD error = r + discount*max(q_st_plus_1) - q_st - # for all other actions, the error is 0 - q_st_plus_1 = self.main_network.target_network.predict(next_states) - # initialize with the current prediction so that we will - TD_targets = self.main_network.online_network.predict(current_states) - - # only update the action that we have actually done in this transition - for i in range(self.tp.batch_size): - TD_targets[i, actions[i]] = rewards[i] + (1.0 - game_overs[i]) * self.tp.agent.discount * np.max( - q_st_plus_1[i], 0) - - result = self.main_network.train_and_sync_networks(current_states, TD_targets) - total_loss = result[0] - - return total_loss diff --git a/agents/human_agent.py b/agents/human_agent.py deleted file mode 100644 index c75c2a2..0000000 --- a/agents/human_agent.py +++ /dev/null @@ -1,67 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -from agents.agent import * -import pygame - - -class HumanAgent(Agent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - Agent.__init__(self, env, tuning_parameters, replicated_device, thread_id) - - self.clock = pygame.time.Clock() - self.max_fps = int(self.tp.visualization.max_fps_for_human_control) - - screen.log_title("Human Control Mode") - available_keys = self.env.get_available_keys() - if available_keys: - screen.log("Use keyboard keys to move. Press escape to quit. 
Available keys:") - screen.log("") - for action, key in self.env.get_available_keys(): - screen.log("\t- {}: {}".format(action, key)) - screen.separator() - - def train(self): - return 0 - - def choose_action(self, curr_state, phase=RunPhase.TRAIN): - action = self.env.get_action_from_user() - - # keep constant fps - self.clock.tick(self.max_fps) - - if not self.env.renderer.is_open: - self.save_replay_buffer_and_exit() - - return action, {"action_value": 0} - - def save_replay_buffer_and_exit(self): - replay_buffer_path = os.path.join(logger.experiments_path, 'replay_buffer.p') - self.memory.tp = None - to_pickle(self.memory, replay_buffer_path) - screen.log_title("Replay buffer was stored in {}".format(replay_buffer_path)) - exit() - - def log_to_screen(self, phase): - # log to screen - screen.log_dict( - OrderedDict([ - ("Episode", self.current_episode), - ("total reward", self.total_reward_in_current_episode), - ("steps", self.total_steps_counter) - ]), - prefix="Recording" - ) diff --git a/agents/imitation_agent.py b/agents/imitation_agent.py deleted file mode 100644 index f893fbe..0000000 --- a/agents/imitation_agent.py +++ /dev/null @@ -1,65 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -from agents.agent import * - - -# Imitation Agent -class ImitationAgent(Agent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - Agent.__init__(self, env, tuning_parameters, replicated_device, thread_id) - self.main_network = NetworkWrapper(tuning_parameters, False, self.has_global, 'main', - self.replicated_device, self.worker_device) - self.networks.append(self.main_network) - self.imitation = True - - def extract_action_values(self, prediction): - return prediction.squeeze() - - def choose_action(self, curr_state, phase=RunPhase.TRAIN): - # convert to batch so we can run it through the network - prediction = self.main_network.online_network.predict(self.tf_input_state(curr_state)) - - # get action values and extract the best action from it - action_values = self.extract_action_values(prediction) - if self.env.discrete_controls: - # DISCRETE - # action = np.argmax(action_values) - action = self.evaluation_exploration_policy.get_action(action_values) - action_value = {"action_probability": action_values[action]} - else: - # CONTINUOUS - action = action_values - action_value = {} - - return action, action_value - - def log_to_screen(self, phase): - # log to screen - if phase == RunPhase.TRAIN: - # for the training phase - we log during the episode to visualize the progress in training - screen.log_dict( - OrderedDict([ - ("Worker", self.task_id), - ("Episode", self.current_episode), - ("Loss", self.loss.values[-1]), - ("Training iteration", self.training_iteration) - ]), - prefix="Training" - ) - else: - # for the evaluation phase - logging as in regular RL - Agent.log_to_screen(self, phase) diff --git a/agents/mmc_agent.py b/agents/mmc_agent.py deleted file mode 100644 index 2b5a2cb..0000000 --- 
a/agents/mmc_agent.py +++ /dev/null @@ -1,42 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -from agents.value_optimization_agent import * - - -class MixedMonteCarloAgent(ValueOptimizationAgent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - ValueOptimizationAgent.__init__(self, env, tuning_parameters, replicated_device, thread_id) - self.mixing_rate = tuning_parameters.agent.monte_carlo_mixing_rate - - def learn_from_batch(self, batch): - current_states, next_states, actions, rewards, game_overs, total_return = self.extract_batch(batch) - - TD_targets = self.main_network.online_network.predict(current_states) - selected_actions = np.argmax(self.main_network.online_network.predict(next_states), 1) - q_st_plus_1 = self.main_network.target_network.predict(next_states) - # initialize with the current prediction so that we will - # only update the action that we have actually done in this transition - for i in range(self.tp.batch_size): - one_step_target = rewards[i] + (1.0 - game_overs[i]) * self.tp.agent.discount * q_st_plus_1[i][ - selected_actions[i]] - monte_carlo_target = total_return[i] - TD_targets[i, actions[i]] = (1 - self.mixing_rate) * one_step_target + self.mixing_rate * monte_carlo_target - - result = self.main_network.train_and_sync_networks(current_states, TD_targets) - total_loss = result[0] - - return total_loss diff --git a/agents/n_step_q_agent.py b/agents/n_step_q_agent.py deleted file mode 100644 index 5a74fb5..0000000 --- a/agents/n_step_q_agent.py +++ /dev/null @@ -1,88 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
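The MixedMonteCarloAgent removed above simply blends two targets: a double-DQN-style one-step bootstrap and the full Monte Carlo return observed for the transition. As a sketch (names are illustrative, not the Coach API):

```python
import numpy as np

def mmc_targets(q_online_current, q_online_next, q_target_next, actions,
                rewards, dones, total_returns, gamma, mixing_rate):
    """Mixed Monte Carlo targets: (1 - mix) * one-step target + mix * episode return."""
    targets = q_online_current.copy()
    batch_idx = np.arange(len(actions))
    selected = np.argmax(q_online_next, axis=1)   # online net selects the bootstrap action
    one_step = rewards + (1.0 - dones) * gamma * q_target_next[batch_idx, selected]
    targets[batch_idx, actions] = (1.0 - mixing_rate) * one_step + mixing_rate * total_returns
    return targets
```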
-# -import numpy as np -import scipy.signal - -from agents.value_optimization_agent import ValueOptimizationAgent -from agents.policy_optimization_agent import PolicyOptimizationAgent -from logger import logger -from utils import Signal, last_sample - - -# N Step Q Learning Agent - https://arxiv.org/abs/1602.01783 -class NStepQAgent(ValueOptimizationAgent, PolicyOptimizationAgent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - ValueOptimizationAgent.__init__(self, env, tuning_parameters, replicated_device, thread_id, create_target_network=True) - self.last_gradient_update_step_idx = 0 - self.q_values = Signal('Q Values') - self.unclipped_grads = Signal('Grads (unclipped)') - self.value_loss = Signal('Value Loss') - self.signals.append(self.q_values) - self.signals.append(self.unclipped_grads) - self.signals.append(self.value_loss) - - def learn_from_batch(self, batch): - # batch contains a list of episodes to learn from - current_states, next_states, actions, rewards, game_overs, _ = self.extract_batch(batch) - - # get the values for the current states - state_value_head_targets = self.main_network.online_network.predict(current_states) - - # the targets for the state value estimator - num_transitions = len(game_overs) - - if self.tp.agent.targets_horizon == '1-Step': - # 1-Step Q learning - q_st_plus_1 = self.main_network.target_network.predict(next_states) - - for i in reversed(range(num_transitions)): - state_value_head_targets[i][actions[i]] = \ - rewards[i] + (1.0 - game_overs[i]) * self.tp.agent.discount * np.max(q_st_plus_1[i], 0) - - elif self.tp.agent.targets_horizon == 'N-Step': - # N-Step Q learning - if game_overs[-1]: - R = 0 - else: - R = np.max(self.main_network.target_network.predict(last_sample(next_states))) - - for i in reversed(range(num_transitions)): - R = rewards[i] + self.tp.agent.discount * R - state_value_head_targets[i][actions[i]] = R - - else: - assert True, 'The available values for targets_horizon are: 1-Step, N-Step' - - # train - result = self.main_network.online_network.accumulate_gradients(current_states, [state_value_head_targets]) - - # logging - total_loss, losses, unclipped_grads = result[:3] - self.unclipped_grads.add_sample(unclipped_grads) - self.value_loss.add_sample(losses[0]) - - return total_loss - - def train(self): - # update the target network of every network that has a target network - if self.total_steps_counter % self.tp.agent.num_steps_between_copying_online_weights_to_target == 0: - for network in self.networks: - network.update_target_network(self.tp.agent.rate_for_copying_weights_to_target) - logger.create_signal_value('Update Target Network', 1) - else: - logger.create_signal_value('Update Target Network', 0, overwrite=False) - - return PolicyOptimizationAgent.train(self) diff --git a/agents/naf_agent.py b/agents/naf_agent.py deleted file mode 100644 index 65ca83c..0000000 --- a/agents/naf_agent.py +++ /dev/null @@ -1,81 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
-# See the License for the specific language governing permissions and -# limitations under the License. -# - -import numpy as np - -from agents.value_optimization_agent import ValueOptimizationAgent -from utils import RunPhase, Signal - - -# Normalized Advantage Functions - https://arxiv.org/pdf/1603.00748.pdf -class NAFAgent(ValueOptimizationAgent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - ValueOptimizationAgent.__init__(self, env, tuning_parameters, replicated_device, thread_id) - self.l_values = Signal("L") - self.a_values = Signal("Advantage") - self.mu_values = Signal("Action") - self.v_values = Signal("V") - self.signals += [self.l_values, self.a_values, self.mu_values, self.v_values] - - def learn_from_batch(self, batch): - current_states, next_states, actions, rewards, game_overs, _ = self.extract_batch(batch) - - # TD error = r + discount*v_st_plus_1 - q_st - v_st_plus_1 = self.main_network.target_network.predict( - next_states, - self.main_network.target_network.output_heads[0].V, - squeeze_output=False, - ) - TD_targets = np.expand_dims(rewards, -1) + (1.0 - np.expand_dims(game_overs, -1)) * self.tp.agent.discount * v_st_plus_1 - - if len(actions.shape) == 1: - actions = np.expand_dims(actions, -1) - - result = self.main_network.train_and_sync_networks({**current_states, 'output_0_0': actions}, TD_targets) - total_loss = result[0] - - return total_loss - - def choose_action(self, curr_state, phase=RunPhase.TRAIN): - assert not self.env.discrete_controls, 'NAF works only for continuous control problems' - - # convert to batch so we can run it through the network - # observation = np.expand_dims(np.array(curr_state['observation']), 0) - naf_head = self.main_network.online_network.output_heads[0] - action_values = self.main_network.online_network.predict( - self.tf_input_state(curr_state), - outputs=naf_head.mu, - squeeze_output=False, - ) - if phase == RunPhase.TRAIN: - action = self.exploration_policy.get_action(action_values) - else: - action = action_values - - Q, L, A, mu, V = self.main_network.online_network.predict( - {**self.tf_input_state(curr_state), 'output_0_0': action_values}, - outputs=[naf_head.Q, naf_head.L, naf_head.A, naf_head.mu, naf_head.V], - ) - - # store the q values statistics for logging - self.q_values.add_sample(Q) - self.l_values.add_sample(L) - self.a_values.add_sample(A) - self.mu_values.add_sample(mu) - self.v_values.add_sample(V) - - action_value = {"action_value": Q} - return action, action_value diff --git a/agents/nec_agent.py b/agents/nec_agent.py deleted file mode 100644 index a327db4..0000000 --- a/agents/nec_agent.py +++ /dev/null @@ -1,96 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
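The NAFAgent removed above only queries the network heads (Q, L, A, mu, V), so the algebra tying them together is easy to miss. Per the NAF paper, the advantage is a quadratic built from the lower-triangular head L; the following single-sample sketch shows the decomposition those heads correspond to (names are illustrative):

```python
import numpy as np

def naf_q_value(v, mu, L, action):
    """NAF decomposition: Q(s, a) = V(s) - 0.5 * (a - mu)^T (L L^T) (a - mu)."""
    P = L @ L.T                      # positive semi-definite curvature of the advantage
    diff = action - mu
    advantage = -0.5 * diff @ P @ diff
    return v + advantage
```

Because the advantage is maximized at a = mu, acting greedily just means taking the mu head's output, which is what `choose_action` above does.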
-# - -import numpy as np -import os, pickle -from agents.value_optimization_agent import ValueOptimizationAgent -from logger import screen -from utils import RunPhase - - -# Neural Episodic Control - https://arxiv.org/pdf/1703.01988.pdf -class NECAgent(ValueOptimizationAgent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - ValueOptimizationAgent.__init__(self, env, tuning_parameters, replicated_device, thread_id, - create_target_network=False) - self.current_episode_state_embeddings = [] - self.training_started = False - - def learn_from_batch(self, batch): - if not self.main_network.online_network.output_heads[0].DND.has_enough_entries(self.tp.agent.number_of_knn): - return 0 - else: - if not self.training_started: - self.training_started = True - screen.log_title("Finished collecting initial entries in DND. Starting to train network...") - - current_states, next_states, actions, rewards, game_overs, total_return = self.extract_batch(batch) - - TD_targets = self.main_network.online_network.predict(current_states) - - # only update the action that we have actually done in this transition - for i in range(self.tp.batch_size): - TD_targets[i, actions[i]] = total_return[i] - - # train the neural network - result = self.main_network.train_and_sync_networks(current_states, TD_targets) - - total_loss = result[0] - - return total_loss - - def act(self, phase=RunPhase.TRAIN): - if self.in_heatup: - # get embedding in heatup (otherwise we get it through choose_action) - embedding = self.main_network.online_network.predict( - self.tf_input_state(self.curr_state), - outputs=self.main_network.online_network.state_embedding) - self.current_episode_state_embeddings.append(embedding) - - return super().act(phase) - - def get_prediction(self, curr_state): - # get the actions q values and the state embedding - embedding, actions_q_values = self.main_network.online_network.predict( - self.tf_input_state(curr_state), - outputs=[self.main_network.online_network.state_embedding, - self.main_network.online_network.output_heads[0].output] - ) - - # store the state embedding for inserting it to the DND later - self.current_episode_state_embeddings.append(embedding.squeeze()) - actions_q_values = actions_q_values[0][0] - return actions_q_values - - def reset_game(self, do_not_reset_env=False): - super().reset_game(do_not_reset_env) - - # get the last full episode that we have collected - episode = self.memory.get_last_complete_episode() - if episode is not None: - # the indexing is only necessary because the heatup can end in the middle of an episode - # this won't be required after fixing this so that when the heatup is ended, the episode is closed - returns = episode.get_transitions_attribute('total_return')[:len(self.current_episode_state_embeddings)] - actions = episode.get_transitions_attribute('action')[:len(self.current_episode_state_embeddings)] - self.main_network.online_network.output_heads[0].DND.add(self.current_episode_state_embeddings, - actions, returns) - - self.current_episode_state_embeddings = [] - - def save_model(self, model_id): - self.main_network.save_model(model_id) - with open(os.path.join(self.tp.save_model_dir, str(model_id) + '.dnd'), 'wb') as f: - pickle.dump(self.main_network.online_network.output_heads[0].DND, f, pickle.HIGHEST_PROTOCOL) diff --git a/agents/pal_agent.py b/agents/pal_agent.py deleted file mode 100644 index 68ff675..0000000 --- a/agents/pal_agent.py +++ /dev/null @@ -1,65 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# 
Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -from agents.value_optimization_agent import * - - -# Persistent Advantage Learning - https://arxiv.org/pdf/1512.04860.pdf -class PALAgent(ValueOptimizationAgent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - ValueOptimizationAgent.__init__(self, env, tuning_parameters, replicated_device, thread_id) - self.alpha = tuning_parameters.agent.pal_alpha - self.persistent = tuning_parameters.agent.persistent_advantage_learning - self.monte_carlo_mixing_rate = tuning_parameters.agent.monte_carlo_mixing_rate - - def learn_from_batch(self, batch): - current_states, next_states, actions, rewards, game_overs, total_return = self.extract_batch(batch) - - selected_actions = np.argmax(self.main_network.online_network.predict(next_states), 1) - - # next state values - q_st_plus_1_target = self.main_network.target_network.predict(next_states) - v_st_plus_1_target = np.max(q_st_plus_1_target, 1) - - # current state values according to online network - q_st_online = self.main_network.online_network.predict(current_states) - - # current state values according to target network - q_st_target = self.main_network.target_network.predict(current_states) - v_st_target = np.max(q_st_target, 1) - - # calculate TD error - TD_targets = np.copy(q_st_online) - for i in range(self.tp.batch_size): - TD_targets[i, actions[i]] = rewards[i] + (1.0 - game_overs[i]) * self.tp.agent.discount * \ - q_st_plus_1_target[i][selected_actions[i]] - advantage_learning_update = v_st_target[i] - q_st_target[i, actions[i]] - next_advantage_learning_update = v_st_plus_1_target[i] - q_st_plus_1_target[i, selected_actions[i]] - # Persistent Advantage Learning or Regular Advantage Learning - if self.persistent: - TD_targets[i, actions[i]] -= self.alpha * min(advantage_learning_update, next_advantage_learning_update) - else: - TD_targets[i, actions[i]] -= self.alpha * advantage_learning_update - - # mixing monte carlo updates - monte_carlo_target = total_return[i] - TD_targets[i, actions[i]] = (1 - self.monte_carlo_mixing_rate) * TD_targets[i, actions[i]] \ - + self.monte_carlo_mixing_rate * monte_carlo_target - - result = self.main_network.train_and_sync_networks(current_states, TD_targets) - total_loss = result[0] - - return total_loss diff --git a/agents/policy_gradients_agent.py b/agents/policy_gradients_agent.py deleted file mode 100644 index 3a592d1..0000000 --- a/agents/policy_gradients_agent.py +++ /dev/null @@ -1,93 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
-# See the License for the specific language governing permissions and -# limitations under the License. -# - -from agents.policy_optimization_agent import * -import numpy as np -from logger import * -import tensorflow as tf -try: - import matplotlib.pyplot as plt -except: - from logger import failed_imports - failed_imports.append("matplotlib") - -from utils import * - - -class PolicyGradientsAgent(PolicyOptimizationAgent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - PolicyOptimizationAgent.__init__(self, env, tuning_parameters, replicated_device, thread_id) - self.returns_mean = Signal('Returns Mean') - self.returns_variance = Signal('Returns Variance') - self.signals.append(self.returns_mean) - self.signals.append(self.returns_variance) - self.last_gradient_update_step_idx = 0 - - def learn_from_batch(self, batch): - # batch contains a list of episodes to learn from - current_states, next_states, actions, rewards, game_overs, total_returns = self.extract_batch(batch) - - for i in reversed(range(len(total_returns))): - if self.policy_gradient_rescaler == PolicyGradientRescaler.TOTAL_RETURN: - total_returns[i] = total_returns[0] - elif self.policy_gradient_rescaler == PolicyGradientRescaler.FUTURE_RETURN: - # just take the total return as it is - pass - elif self.policy_gradient_rescaler == PolicyGradientRescaler.FUTURE_RETURN_NORMALIZED_BY_EPISODE: - # we can get a single transition episode while playing Doom Basic, causing the std to be 0 - if self.std_discounted_return != 0: - total_returns[i] = (total_returns[i] - self.mean_discounted_return) / self.std_discounted_return - else: - total_returns[i] = 0 - elif self.policy_gradient_rescaler == PolicyGradientRescaler.FUTURE_RETURN_NORMALIZED_BY_TIMESTEP: - total_returns[i] -= self.mean_return_over_multiple_episodes[i] - else: - screen.warning("WARNING: The requested policy gradient rescaler is not available") - - targets = total_returns - if not self.env.discrete_controls and len(actions.shape) < 2: - actions = np.expand_dims(actions, -1) - - self.returns_mean.add_sample(np.mean(total_returns)) - self.returns_variance.add_sample(np.std(total_returns)) - - result = self.main_network.online_network.accumulate_gradients({**current_states, 'output_0_0': actions}, targets) - total_loss = result[0] - - return total_loss - - def choose_action(self, curr_state, phase=RunPhase.TRAIN): - # convert to batch so we can run it through the network - if self.env.discrete_controls: - # DISCRETE - action_values = self.main_network.online_network.predict(self.tf_input_state(curr_state)).squeeze() - if phase == RunPhase.TRAIN: - action = self.exploration_policy.get_action(action_values) - else: - action = np.argmax(action_values) - action_value = {"action_probability": action_values[action]} - self.entropy.add_sample(-np.sum(action_values * np.log(action_values + eps))) - else: - # CONTINUOUS - result = self.main_network.online_network.predict(self.tf_input_state(curr_state)) - action_values = result[0].squeeze() - if phase == RunPhase.TRAIN: - action = self.exploration_policy.get_action(action_values) - else: - action = action_values - action_value = {} - - return action, action_value diff --git a/agents/policy_optimization_agent.py b/agents/policy_optimization_agent.py deleted file mode 100644 index be23760..0000000 --- a/agents/policy_optimization_agent.py +++ /dev/null @@ -1,123 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not 
use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -from agents.agent import * -from memories.memory import Episode - - -class PolicyGradientRescaler(Enum): - TOTAL_RETURN = 0 - FUTURE_RETURN = 1 - FUTURE_RETURN_NORMALIZED_BY_EPISODE = 2 - FUTURE_RETURN_NORMALIZED_BY_TIMESTEP = 3 # baselined - Q_VALUE = 4 - A_VALUE = 5 - TD_RESIDUAL = 6 - DISCOUNTED_TD_RESIDUAL = 7 - GAE = 8 - - -class PolicyOptimizationAgent(Agent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0, create_target_network=False): - Agent.__init__(self, env, tuning_parameters, replicated_device, thread_id) - self.main_network = NetworkWrapper(tuning_parameters, create_target_network, self.has_global, 'main', - self.replicated_device, self.worker_device) - self.networks.append(self.main_network) - - self.policy_gradient_rescaler = PolicyGradientRescaler().get(self.tp.agent.policy_gradient_rescaler) - - # statistics for variance reduction - self.last_gradient_update_step_idx = 0 - self.max_episode_length = 100000 - self.mean_return_over_multiple_episodes = np.zeros(self.max_episode_length) - self.num_episodes_where_step_has_been_seen = np.zeros(self.max_episode_length) - self.entropy = Signal('Entropy') - self.signals.append(self.entropy) - - self.reset_game(do_not_reset_env=True) - - def log_to_screen(self, phase): - # log to screen - if self.current_episode > 0: - screen.log_dict( - OrderedDict([ - ("Worker", self.task_id), - ("Episode", self.current_episode), - ("total reward", self.total_reward_in_current_episode), - ("steps", self.total_steps_counter), - ("training iteration", self.training_iteration) - ]), - prefix=phase - ) - - def update_episode_statistics(self, episode): - episode_discounted_returns = [] - for i in range(episode.length()): - transition = episode.get_transition(i) - episode_discounted_returns.append(transition.total_return) - self.num_episodes_where_step_has_been_seen[i] += 1 - self.mean_return_over_multiple_episodes[i] -= self.mean_return_over_multiple_episodes[i] / \ - self.num_episodes_where_step_has_been_seen[i] - self.mean_return_over_multiple_episodes[i] += transition.total_return / \ - self.num_episodes_where_step_has_been_seen[i] - self.mean_discounted_return = np.mean(episode_discounted_returns) - self.std_discounted_return = np.std(episode_discounted_returns) - - def train(self): - if self.memory.length() == 0: - return 0 - - episode = self.memory.get_episode(0) - - # check if we should calculate gradients or skip - episode_ended = self.memory.num_complete_episodes() >= 1 - num_steps_passed_since_last_update = episode.length() - self.last_gradient_update_step_idx - is_t_max_steps_passed = num_steps_passed_since_last_update >= self.tp.agent.num_steps_between_gradient_updates - if not (is_t_max_steps_passed or episode_ended): - return 0 - - total_loss = 0 - if num_steps_passed_since_last_update > 0: - - # we need to update the returns of the episode until now - episode.update_returns(self.tp.agent.discount) - - # get t_max transitions or less if the we got to a terminal state - # will be used for both actor-critic and vanilla PG. 
- # # In order to get full episodes, Vanilla PG will set the end_idx to a very big value. - transitions = [] - start_idx = self.last_gradient_update_step_idx - end_idx = episode.length() - - for idx in range(start_idx, end_idx): - transitions.append(episode.get_transition(idx)) - self.last_gradient_update_step_idx = end_idx - - # update the statistics for the variance reduction techniques - if self.tp.agent.type == 'PolicyGradientsAgent': - self.update_episode_statistics(episode) - - # accumulate the gradients and apply them once in every apply_gradients_every_x_episodes episodes - total_loss = self.learn_from_batch(transitions) - if self.current_episode % self.tp.agent.apply_gradients_every_x_episodes == 0: - self.main_network.apply_gradients_and_sync_networks() - - # move the pointer to the next episode start and discard the episode. we use it only once - if episode_ended: - self.memory.remove_episode(0) - self.last_gradient_update_step_idx = 0 - - return total_loss diff --git a/agents/ppo_agent.py b/agents/ppo_agent.py deleted file mode 100644 index 4a37e69..0000000 --- a/agents/ppo_agent.py +++ /dev/null @@ -1,289 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -from agents.actor_critic_agent import * -from random import shuffle - - -# Proximal Policy Optimization - https://arxiv.org/pdf/1707.06347.pdf -class PPOAgent(ActorCriticAgent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - ActorCriticAgent.__init__(self, env, tuning_parameters, replicated_device, thread_id, - create_target_network=True) - self.critic_network = self.main_network - - # define the policy network - tuning_parameters.agent.input_types = {'observation': InputTypes.Observation} - tuning_parameters.agent.output_types = [OutputTypes.PPO] - tuning_parameters.agent.optimizer_type = 'Adam' - tuning_parameters.agent.l2_regularization = 0 - self.policy_network = NetworkWrapper(tuning_parameters, True, self.has_global, 'policy', - self.replicated_device, self.worker_device) - self.networks.append(self.policy_network) - - # signals definition - self.value_loss = Signal('Value Loss') - self.signals.append(self.value_loss) - self.policy_loss = Signal('Policy Loss') - self.signals.append(self.policy_loss) - self.kl_divergence = Signal('KL Divergence') - self.signals.append(self.kl_divergence) - self.total_kl_divergence_during_training_process = 0.0 - self.unclipped_grads = Signal('Grads (unclipped)') - self.signals.append(self.unclipped_grads) - - self.reset_game(do_not_reset_env=True) - - def fill_advantages(self, batch): - current_states, next_states, actions, rewards, game_overs, total_return = self.extract_batch(batch) - - # * Found not to have any impact * - # current_states_with_timestep = self.concat_state_and_timestep(batch) - - current_state_values = self.critic_network.online_network.predict(current_states).squeeze() - - # calculate advantages - advantages = [] - if self.policy_gradient_rescaler == PolicyGradientRescaler.A_VALUE: 
- advantages = total_return - current_state_values - elif self.policy_gradient_rescaler == PolicyGradientRescaler.GAE: - # get bootstraps - episode_start_idx = 0 - advantages = np.array([]) - # current_state_values[game_overs] = 0 - for idx, game_over in enumerate(game_overs): - if game_over: - # get advantages for the rollout - value_bootstrapping = np.zeros((1,)) - rollout_state_values = np.append(current_state_values[episode_start_idx:idx+1], value_bootstrapping) - - rollout_advantages, _ = \ - self.get_general_advantage_estimation_values(rewards[episode_start_idx:idx+1], - rollout_state_values) - episode_start_idx = idx + 1 - advantages = np.append(advantages, rollout_advantages) - else: - screen.warning("WARNING: The requested policy gradient rescaler is not available") - - # standardize - advantages = (advantages - np.mean(advantages)) / np.std(advantages) - - for transition, advantage in zip(self.memory.transitions, advantages): - transition.info['advantage'] = advantage - - self.action_advantages.add_sample(advantages) - - def train_value_network(self, dataset, epochs): - loss = [] - current_states, _, _, _, _, total_return = self.extract_batch(dataset) - - # * Found not to have any impact * - # add a timestep to the observation - # current_states_with_timestep = self.concat_state_and_timestep(dataset) - - total_return = np.expand_dims(total_return, -1) - mix_fraction = self.tp.agent.value_targets_mix_fraction - for j in range(epochs): - batch_size = len(dataset) - if self.critic_network.online_network.optimizer_type != 'LBFGS': - batch_size = self.tp.batch_size - for i in range(len(dataset) // batch_size): - # split to batches for first order optimization techniques - current_states_batch = { - k: v[i * batch_size:(i + 1) * batch_size] - for k, v in current_states.items() - } - total_return_batch = total_return[i * batch_size:(i + 1) * batch_size] - old_policy_values = force_list(self.critic_network.target_network.predict( - current_states_batch).squeeze()) - if self.critic_network.online_network.optimizer_type != 'LBFGS': - targets = total_return_batch - else: - current_values = self.critic_network.online_network.predict(current_states_batch) - targets = current_values * (1 - mix_fraction) + total_return_batch * mix_fraction - - inputs = copy.copy(current_states_batch) - for input_index, input in enumerate(old_policy_values): - name = 'output_0_{}'.format(input_index) - if name in self.critic_network.online_network.inputs: - inputs[name] = input - - value_loss = self.critic_network.online_network.accumulate_gradients(inputs, targets) - self.critic_network.apply_gradients_to_online_network() - if self.tp.distributed: - self.critic_network.apply_gradients_to_global_network() - self.critic_network.online_network.reset_accumulated_gradients() - - loss.append([value_loss[0]]) - loss = np.mean(loss, 0) - return loss - - def concat_state_and_timestep(self, dataset): - current_states_with_timestep = [np.append(transition.state['observation'], transition.info['timestep']) - for transition in dataset] - current_states_with_timestep = np.expand_dims(current_states_with_timestep, -1) - return current_states_with_timestep - - def train_policy_network(self, dataset, epochs): - loss = [] - for j in range(epochs): - loss = { - 'total_loss': [], - 'policy_losses': [], - 'unclipped_grads': [], - 'fetch_result': [] - } - #shuffle(dataset) - for i in range(len(dataset) // self.tp.batch_size): - batch = dataset[i * self.tp.batch_size:(i + 1) * self.tp.batch_size] - current_states, _, actions, _, _, 
total_return = self.extract_batch(batch) - advantages = np.array([t.info['advantage'] for t in batch]) - if not self.tp.env_instance.discrete_controls and len(actions.shape) == 1: - actions = np.expand_dims(actions, -1) - - # get old policy probabilities and distribution - old_policy = force_list(self.policy_network.target_network.predict(current_states)) - - # calculate gradients and apply on both the local policy network and on the global policy network - fetches = [self.policy_network.online_network.output_heads[0].kl_divergence, - self.policy_network.online_network.output_heads[0].entropy] - - inputs = copy.copy(current_states) - # TODO: why is this output 0 and not output 1? - inputs['output_0_0'] = actions - # TODO: does old_policy_distribution really need to be represented as a list? - # A: yes it does, in the event of discrete controls, it has just a mean - # otherwise, it has both a mean and standard deviation - for input_index, input in enumerate(old_policy): - inputs['output_0_{}'.format(input_index + 1)] = input - total_loss, policy_losses, unclipped_grads, fetch_result =\ - self.policy_network.online_network.accumulate_gradients( - inputs, [advantages], additional_fetches=fetches) - - self.policy_network.apply_gradients_to_online_network() - if self.tp.distributed: - self.policy_network.apply_gradients_to_global_network() - - self.policy_network.online_network.reset_accumulated_gradients() - - loss['total_loss'].append(total_loss) - loss['policy_losses'].append(policy_losses) - loss['unclipped_grads'].append(unclipped_grads) - loss['fetch_result'].append(fetch_result) - - self.unclipped_grads.add_sample(unclipped_grads) - - for key in loss.keys(): - loss[key] = np.mean(loss[key], 0) - - if self.tp.learning_rate_decay_rate != 0: - curr_learning_rate = self.main_network.online_network.get_variable_value(self.tp.learning_rate) - self.curr_learning_rate.add_sample(curr_learning_rate) - else: - curr_learning_rate = self.tp.learning_rate - - # log training parameters - screen.log_dict( - OrderedDict([ - ("Surrogate loss", loss['policy_losses'][0]), - ("KL divergence", loss['fetch_result'][0]), - ("Entropy", loss['fetch_result'][1]), - ("training epoch", j), - ("learning_rate", curr_learning_rate) - ]), - prefix="Policy training" - ) - - self.total_kl_divergence_during_training_process = loss['fetch_result'][0] - self.entropy.add_sample(loss['fetch_result'][1]) - self.kl_divergence.add_sample(loss['fetch_result'][0]) - return loss['total_loss'] - - def update_kl_coefficient(self): - # John Schulman takes the mean kl divergence only over the last epoch which is strange but we will follow - # his implementation for now because we know it works well - screen.log_title("KL = {}".format(self.total_kl_divergence_during_training_process)) - - # update kl coefficient - kl_target = self.tp.agent.target_kl_divergence - kl_coefficient = self.policy_network.online_network.get_variable_value( - self.policy_network.online_network.output_heads[0].kl_coefficient) - new_kl_coefficient = kl_coefficient - if self.total_kl_divergence_during_training_process > 1.3 * kl_target: - # kl too high => increase regularization - new_kl_coefficient *= 1.5 - elif self.total_kl_divergence_during_training_process < 0.7 * kl_target: - # kl too low => decrease regularization - new_kl_coefficient /= 1.5 - - # update the kl coefficient variable - if kl_coefficient != new_kl_coefficient: - self.policy_network.online_network.set_variable_value( - self.policy_network.online_network.output_heads[0].assign_kl_coefficient, - 
new_kl_coefficient, - self.policy_network.online_network.output_heads[0].kl_coefficient_ph) - - screen.log_title("KL penalty coefficient change = {} -> {}".format(kl_coefficient, new_kl_coefficient)) - - def post_training_commands(self): - if self.tp.agent.use_kl_regularization: - self.update_kl_coefficient() - - # clean memory - self.memory.clean() - - def train(self): - self.policy_network.sync() - self.critic_network.sync() - - dataset = self.memory.transitions - - self.fill_advantages(dataset) - - # take only the requested number of steps - dataset = dataset[:self.tp.agent.num_consecutive_playing_steps] - - value_loss = self.train_value_network(dataset, 1) - policy_loss = self.train_policy_network(dataset, 10) - - self.value_loss.add_sample(value_loss) - self.policy_loss.add_sample(policy_loss) - self.update_log() # should be done in order to update the data that has been accumulated * while not playing * - return np.append(value_loss, policy_loss) - - def choose_action(self, curr_state, phase=RunPhase.TRAIN): - if self.env.discrete_controls: - # DISCRETE - action_values = self.policy_network.online_network.predict(self.tf_input_state(curr_state)).squeeze() - - if phase == RunPhase.TRAIN: - action = self.exploration_policy.get_action(action_values) - else: - action = np.argmax(action_values) - action_info = {"action_probability": action_values[action]} - # self.entropy.add_sample(-np.sum(action_values * np.log(action_values))) - else: - # CONTINUOUS - action_values_mean, action_values_std = self.policy_network.online_network.predict(self.tf_input_state(curr_state)) - action_values_mean = action_values_mean.squeeze() - action_values_std = action_values_std.squeeze() - if phase == RunPhase.TRAIN: - action = np.squeeze(np.random.randn(1, self.action_space_size) * action_values_std + action_values_mean) - else: - action = action_values_mean - action_info = {"action_probability": action_values_mean} - - return action, action_info diff --git a/agents/qr_dqn_agent.py b/agents/qr_dqn_agent.py deleted file mode 100644 index 8888d18..0000000 --- a/agents/qr_dqn_agent.py +++ /dev/null @@ -1,66 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
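# The adaptive KL penalty applied by update_kl_coefficient() above reduces to a simple
# rule: grow the penalty when the measured KL overshoots the target, shrink it when it
# undershoots. A minimal standalone sketch (the 1.3x / 0.7x thresholds and the factor
# 1.5 come from the code above; everything else here is illustrative):
def adapt_kl_coefficient(kl_coefficient, measured_kl, kl_target):
    if measured_kl > 1.3 * kl_target:
        kl_coefficient *= 1.5   # KL too high -> increase regularization
    elif measured_kl < 0.7 * kl_target:
        kl_coefficient /= 1.5   # KL too low -> decrease regularization
    return kl_coefficient

# e.g. adapt_kl_coefficient(1.0, measured_kl=0.05, kl_target=0.01) -> 1.5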
-# - -from agents.value_optimization_agent import * - - -# Quantile Regression Deep Q Network - https://arxiv.org/pdf/1710.10044v1.pdf -class QuantileRegressionDQNAgent(ValueOptimizationAgent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0): - ValueOptimizationAgent.__init__(self, env, tuning_parameters, replicated_device, thread_id) - self.quantile_probabilities = np.ones(self.tp.agent.atoms) / float(self.tp.agent.atoms) - - # prediction's format is (batch,actions,atoms) - def get_q_values(self, quantile_values): - return np.dot(quantile_values, self.quantile_probabilities) - - def learn_from_batch(self, batch): - current_states, next_states, actions, rewards, game_overs, _ = self.extract_batch(batch) - - # get the quantiles of the next states and current states - next_state_quantiles = self.main_network.target_network.predict(next_states) - current_quantiles = self.main_network.online_network.predict(current_states) - - # get the optimal actions to take for the next states - target_actions = np.argmax(self.get_q_values(next_state_quantiles), axis=1) - - # calculate the Bellman update - batch_idx = list(range(self.tp.batch_size)) - rewards = np.expand_dims(rewards, -1) - game_overs = np.expand_dims(game_overs, -1) - TD_targets = rewards + (1.0 - game_overs) * self.tp.agent.discount \ - * next_state_quantiles[batch_idx, target_actions] - - # get the locations of the selected actions within the batch for indexing purposes - actions_locations = [[b, a] for b, a in zip(batch_idx, actions)] - - # calculate the cumulative quantile probabilities and reorder them to fit the sorted quantiles order - cumulative_probabilities = np.array(range(self.tp.agent.atoms+1))/float(self.tp.agent.atoms) # tau_i - quantile_midpoints = 0.5*(cumulative_probabilities[1:] + cumulative_probabilities[:-1]) # tau^hat_i - quantile_midpoints = np.tile(quantile_midpoints, (self.tp.batch_size, 1)) - sorted_quantiles = np.argsort(current_quantiles[batch_idx, actions]) - for idx in range(self.tp.batch_size): - quantile_midpoints[idx, :] = quantile_midpoints[idx, sorted_quantiles[idx]] - - # train - result = self.main_network.train_and_sync_networks({ - **current_states, - 'output_0_0': actions_locations, - 'output_0_1': quantile_midpoints, - }, TD_targets) - total_loss = result[0] - - return total_loss diff --git a/agents/value_optimization_agent.py b/agents/value_optimization_agent.py deleted file mode 100644 index 75708d7..0000000 --- a/agents/value_optimization_agent.py +++ /dev/null @@ -1,77 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
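# For the QuantileRegressionDQNAgent above: with a uniform distribution over `atoms`
# quantiles, the Q-value is simply the mean of the predicted quantile values, and the
# quantile midpoints tau-hat are fixed up front. A small numeric sketch (atoms=4 is an
# arbitrary choice and the quantile values are hypothetical):
import numpy as np

atoms = 4
quantile_probabilities = np.ones(atoms) / atoms            # [0.25, 0.25, 0.25, 0.25]
cumulative_probabilities = np.arange(atoms + 1) / atoms    # tau_i:     [0, 0.25, 0.5, 0.75, 1]
quantile_midpoints = 0.5 * (cumulative_probabilities[1:] +
                            cumulative_probabilities[:-1]) # tau-hat_i: [0.125, 0.375, 0.625, 0.875]

quantile_values = np.array([1.0, 2.0, 3.0, 6.0])           # hypothetical quantiles for one action
q_value = np.dot(quantile_values, quantile_probabilities)  # 3.0 -- the mean of the quantiles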
-# - -import numpy as np - -from agents.agent import Agent -from architectures.network_wrapper import NetworkWrapper -from utils import RunPhase, Signal - - -class ValueOptimizationAgent(Agent): - def __init__(self, env, tuning_parameters, replicated_device=None, thread_id=0, create_target_network=True): - Agent.__init__(self, env, tuning_parameters, replicated_device, thread_id) - self.main_network = NetworkWrapper(tuning_parameters, create_target_network, self.has_global, 'main', - self.replicated_device, self.worker_device) - self.networks.append(self.main_network) - self.q_values = Signal("Q") - self.signals.append(self.q_values) - - self.reset_game(do_not_reset_env=True) - - # Algorithms for which q_values are calculated from predictions will override this function - def get_q_values(self, prediction): - return prediction - - def get_prediction(self, curr_state): - return self.main_network.online_network.predict(self.tf_input_state(curr_state)) - - def _validate_action(self, policy, action): - if np.array(action).shape != (): - raise ValueError(( - 'The exploration_policy {} returned a vector of actions ' - 'instead of a single action. ValueOptimizationAgents ' - 'require exploration policies which return a single action.' - ).format(policy.__class__.__name__)) - - def choose_action(self, curr_state, phase=RunPhase.TRAIN): - prediction = self.get_prediction(curr_state) - actions_q_values = self.get_q_values(prediction) - - # choose action according to the exploration policy and the current phase (evaluating or training the agent) - if phase == RunPhase.TRAIN: - exploration_policy = self.exploration_policy - else: - exploration_policy = self.evaluation_exploration_policy - - action = exploration_policy.get_action(actions_q_values) - self._validate_action(exploration_policy, action) - - # this is for bootstrapped dqn - if type(actions_q_values) == list and len(actions_q_values) > 0: - actions_q_values = actions_q_values[self.exploration_policy.selected_head] - actions_q_values = actions_q_values.squeeze() - - # store the q values statistics for logging - self.q_values.add_sample(actions_q_values) - - # store information for plotting interactively (actual plotting is done in agent) - if self.tp.visualization.plot_action_values_online: - for idx, action_name in enumerate(self.env.actions_description): - self.episode_running_info[action_name].append(actions_q_values[idx]) - - action_value = {"action_value": actions_q_values[action], "max_action_value": np.max(actions_q_values)} - return action, action_value diff --git a/architectures/__init__.py b/architectures/__init__.py deleted file mode 100644 index cbf2ac5..0000000 --- a/architectures/__init__.py +++ /dev/null @@ -1,31 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
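# ValueOptimizationAgent.get_q_values() above is the hook that value-based agents
# override when the network output is not directly a Q-vector (the quantile-regression
# agent earlier reduces a (actions, atoms) prediction to Q-values by averaging over
# atoms). A minimal sketch of the pattern, with hypothetical class names:
import numpy as np

class BaseValueAgent:
    def get_q_values(self, prediction):
        return prediction                   # default: the prediction is already Q(s, .)

class DistributionalValueAgent(BaseValueAgent):
    def get_q_values(self, prediction):
        return prediction.mean(axis=-1)     # reduce a per-action distribution to Q(s, .)

# DistributionalValueAgent().get_q_values(np.array([[1., 2.], [3., 5.]])) -> [1.5, 4.]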
-# - -from architectures.architecture import * -from logger import failed_imports -try: - from architectures.tensorflow_components.general_network import * - from architectures.tensorflow_components.architecture import * -except ImportError: - failed_imports.append("TensorFlow") - -try: - from architectures.neon_components.general_network import * - from architectures.neon_components.architecture import * -except ImportError: - failed_imports.append("Neon") - -from architectures.network_wrapper import * \ No newline at end of file diff --git a/architectures/neon_components/architecture.py b/architectures/neon_components/architecture.py deleted file mode 100644 index de600c1..0000000 --- a/architectures/neon_components/architecture.py +++ /dev/null @@ -1,129 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -import sys -import copy -from ngraph.frontends.neon import * -import ngraph as ng -from architectures.architecture import * -import numpy as np -from utils import * - - -class NeonArchitecture(Architecture): - def __init__(self, tuning_parameters, name="", global_network=None, network_is_local=True): - Architecture.__init__(self, tuning_parameters, name) - assert tuning_parameters.agent.neon_support, 'Neon is not supported for this agent' - self.clip_error = tuning_parameters.clip_gradients - self.total_loss = None - self.epoch = 0 - self.inputs = [] - self.outputs = [] - self.targets = [] - self.losses = [] - - self.transformer = tuning_parameters.sess - self.network = self.get_model(tuning_parameters) - self.accumulated_gradients = [] - - # training and inference ops - train_output = ng.sequential([ - self.optimizer(self.total_loss), - self.total_loss - ]) - placeholders = self.inputs + self.targets - self.train_op = self.transformer.add_computation( - ng.computation( - train_output, *placeholders - ) - ) - self.predict_op = self.transformer.add_computation( - ng.computation( - self.outputs, self.inputs[0] - ) - ) - - # update weights from array op - self.weights = [ng.placeholder(w.axes) for w in self.total_loss.variables()] - self.set_weights_ops = [] - for target_variable, variable in zip(self.total_loss.variables(), self.weights): - self.set_weights_ops.append(self.transformer.add_computation( - ng.computation( - ng.assign(target_variable, variable), variable - ) - )) - - # get weights op - self.get_variables = self.transformer.add_computation( - ng.computation( - self.total_loss.variables() - ) - ) - - def predict(self, inputs): - batch_size = inputs.shape[0] - - # move batch axis to the end - inputs = inputs.swapaxes(0, -1) - prediction = self.predict_op(inputs) # TODO: problem with multiple inputs - - if type(prediction) != tuple: - prediction = (prediction) - - # process all the outputs from the network - output = [] - for p in prediction: - output.append(p.transpose()[:batch_size].copy()) - - # if there is only one output then we don't need a list - if len(output) == 1: - output = output[0] - return output - - def 
train_on_batch(self, inputs, targets): - loss = self.accumulate_gradients(inputs, targets) - self.apply_and_reset_gradients(self.accumulated_gradients) - return loss - - def get_weights(self): - return self.get_variables() - - def set_weights(self, weights, rate=1.0): - if rate != 1: - current_weights = self.get_weights() - updated_weights = [(1 - rate) * t + rate * o for t, o in zip(current_weights, weights)] - else: - updated_weights = weights - for update_function, variable in zip(self.set_weights_ops, updated_weights): - update_function(variable) - - def accumulate_gradients(self, inputs, targets): - # Neon doesn't currently allow separating the grads calculation and grad apply operations - # so this feature is not currently available. instead we do a full training iteration - inputs = force_list(inputs) - targets = force_list(targets) - - for idx, input in enumerate(inputs): - inputs[idx] = input.swapaxes(0, -1) - - for idx, target in enumerate(targets): - targets[idx] = np.rollaxis(target, 0, len(target.shape)) - - all_inputs = inputs + targets - - loss = np.mean(self.train_op(*all_inputs)) - - return [loss] diff --git a/architectures/neon_components/embedders.py b/architectures/neon_components/embedders.py deleted file mode 100644 index 5f594a3..0000000 --- a/architectures/neon_components/embedders.py +++ /dev/null @@ -1,88 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
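# set_weights(weights, rate) above blends the incoming weights with the current ones,
# which is what a "soft" target-network update looks like:
#     w_updated = (1 - rate) * w_current + rate * w_incoming
# With rate=1.0 this is a plain copy. A tiny numeric sketch:
import numpy as np

w_current, w_incoming, rate = np.array([0.0, 2.0]), np.array([1.0, 0.0]), 0.1
w_updated = (1 - rate) * w_current + rate * w_incoming   # [0.1, 1.8]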
-# - -import ngraph.frontends.neon as neon -import ngraph as ng -from ngraph.util.names import name_scope - - -class InputEmbedder(object): - def __init__(self, input_size, batch_size=None, activation_function=neon.Rectlin(), name="embedder"): - self.name = name - self.input_size = input_size - self.batch_size = batch_size - self.activation_function = activation_function - self.weights_init = neon.GlorotInit() - self.biases_init = neon.ConstantInit() - self.input = None - self.output = None - - def __call__(self, prev_input_placeholder=None): - with name_scope(self.get_name()): - # create the input axes - axes = [] - if len(self.input_size) == 2: - axis_names = ['H', 'W'] - else: - axis_names = ['C', 'H', 'W'] - for axis_size, axis_name in zip(self.input_size, axis_names): - axes.append(ng.make_axis(axis_size, name=axis_name)) - batch_axis_full = ng.make_axis(self.batch_size, name='N') - input_axes = ng.make_axes(axes) - - if prev_input_placeholder is None: - self.input = ng.placeholder(input_axes + [batch_axis_full]) - else: - self.input = prev_input_placeholder - self._build_module() - - return self.input, self.output(self.input) - - def _build_module(self): - pass - - def get_name(self): - return self.name - - -class ImageEmbedder(InputEmbedder): - def __init__(self, input_size, batch_size=None, input_rescaler=255.0, activation_function=neon.Rectlin(), name="embedder"): - InputEmbedder.__init__(self, input_size, batch_size, activation_function, name) - self.input_rescaler = input_rescaler - - def _build_module(self): - # image observation - self.output = neon.Sequential([ - neon.Preprocess(functor=lambda x: x / self.input_rescaler), - neon.Convolution((8, 8, 32), strides=4, activation=self.activation_function, - filter_init=self.weights_init, bias_init=self.biases_init), - neon.Convolution((4, 4, 64), strides=2, activation=self.activation_function, - filter_init=self.weights_init, bias_init=self.biases_init), - neon.Convolution((3, 3, 64), strides=1, activation=self.activation_function, - filter_init=self.weights_init, bias_init=self.biases_init) - ]) - - -class VectorEmbedder(InputEmbedder): - def __init__(self, input_size, batch_size=None, activation_function=neon.Rectlin(), name="embedder"): - InputEmbedder.__init__(self, input_size, batch_size, activation_function, name) - - def _build_module(self): - # vector observation - self.output = neon.Sequential([ - neon.Affine(nout=256, activation=self.activation_function, - weight_init=self.weights_init, bias_init=self.biases_init) - ]) diff --git a/architectures/neon_components/general_network.py b/architectures/neon_components/general_network.py deleted file mode 100644 index 99ac6e9..0000000 --- a/architectures/neon_components/general_network.py +++ /dev/null @@ -1,192 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
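# The ImageEmbedder above is the classic three-layer DQN convolution stack
# (8x8/stride 4, 4x4/stride 2, 3x3/stride 1). As a worked example, for an 84x84 input
# and assuming unpadded ("valid") convolutions (both the 84x84 size and the padding are
# assumptions, not stated in this file):
#     (84 - 8) / 4 + 1 = 20,   (20 - 4) / 2 + 1 = 9,   (9 - 3) / 1 + 1 = 7
# i.e. the spatial resolution shrinks 84 -> 20 -> 9 -> 7 before flattening.
def conv_output_size(size, kernel, stride):
    return (size - kernel) // stride + 1   # valid (unpadded) convolution

# conv_output_size(conv_output_size(conv_output_size(84, 8, 4), 4, 2), 3, 1) -> 7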
-# - -from architectures.neon_components.embedders import * -from architectures.neon_components.heads import * -from architectures.neon_components.middleware import * -from architectures.neon_components.architecture import * -from configurations import InputTypes, OutputTypes, MiddlewareTypes - - -class GeneralNeonNetwork(NeonArchitecture): - def __init__(self, tuning_parameters, name="", global_network=None, network_is_local=True): - self.global_network = global_network - self.network_is_local = network_is_local - self.num_heads_per_network = 1 if tuning_parameters.agent.use_separate_networks_per_head else \ - len(tuning_parameters.agent.output_types) - self.num_networks = 1 if not tuning_parameters.agent.use_separate_networks_per_head else \ - len(tuning_parameters.agent.output_types) - self.input_embedders = [] - self.output_heads = [] - self.activation_function = self.get_activation_function( - tuning_parameters.agent.hidden_layers_activation_function) - - NeonArchitecture.__init__(self, tuning_parameters, name, global_network, network_is_local) - - def get_activation_function(self, activation_function_string): - activation_functions = { - 'relu': neon.Rectlin(), - 'tanh': neon.Tanh(), - 'sigmoid': neon.Logistic(), - 'elu': neon.Explin(), - 'selu': None, - 'none': None - } - assert activation_function_string in activation_functions.keys(), \ - "Activation function must be one of the following {}".format(activation_functions.keys()) - return activation_functions[activation_function_string] - - def get_input_embedder(self, embedder_type): - # the observation can be either an image or a vector - def get_observation_embedding(with_timestep=False): - if self.input_height > 1: - return ImageEmbedder((self.input_depth, self.input_height, self.input_width), self.batch_size, - name="observation") - else: - return VectorEmbedder((self.input_depth, self.input_width + int(with_timestep)), self.batch_size, - name="observation") - - input_mapping = { - InputTypes.Observation: get_observation_embedding(), - InputTypes.Measurements: VectorEmbedder(self.measurements_size, self.batch_size, name="measurements"), - InputTypes.GoalVector: VectorEmbedder(self.measurements_size, self.batch_size, name="goal_vector"), - InputTypes.Action: VectorEmbedder((self.num_actions,), self.batch_size, name="action"), - InputTypes.TimedObservation: get_observation_embedding(with_timestep=True), - } - return input_mapping[embedder_type] - - def get_middleware_embedder(self, middleware_type): - return {MiddlewareTypes.LSTM: None, # LSTM over Neon is currently not supported in Coach - MiddlewareTypes.FC: FC_Embedder}.get(middleware_type)(self.activation_function) - - def get_output_head(self, head_type, head_idx, loss_weight=1.): - output_mapping = { - OutputTypes.Q: QHead, - OutputTypes.DuelingQ: DuelingQHead, - OutputTypes.V: None, # Policy Optimization algorithms over Neon are currently not supported in Coach - OutputTypes.Pi: None, # Policy Optimization algorithms over Neon are currently not supported in Coach - OutputTypes.MeasurementsPrediction: None, # DFP over Neon is currently not supported in Coach - OutputTypes.DNDQ: None, # NEC over Neon is currently not supported in Coach - OutputTypes.NAF: None, # NAF over Neon is currently not supported in Coach - OutputTypes.PPO: None, # PPO over Neon is currently not supported in Coach - OutputTypes.PPO_V: None # PPO over Neon is currently not supported in Coach - } - return output_mapping[head_type](self.tp, head_idx, loss_weight, self.network_is_local) - - def 
get_model(self, tuning_parameters): - """ - :param tuning_parameters: A Preset class instance with all the running paramaters - :type tuning_parameters: Preset - :return: A model - """ - assert len(self.tp.agent.input_types) > 0, "At least one input type should be defined" - assert len(self.tp.agent.output_types) > 0, "At least one output type should be defined" - assert self.tp.agent.middleware_type is not None, "Exactly one middleware type should be defined" - assert len(self.tp.agent.loss_weights) > 0, "At least one loss weight should be defined" - assert len(self.tp.agent.output_types) == len(self.tp.agent.loss_weights), \ - "Number of loss weights should match the number of output types" - local_network_in_distributed_training = self.global_network is not None and self.network_is_local - - tuning_parameters.activation_function = self.activation_function - done_creating_input_placeholders = False - - for network_idx in range(self.num_networks): - with name_scope('network_{}'.format(network_idx)): - #################### - # Input Embeddings # - #################### - - state_embedding = [] - for idx, input_type in enumerate(self.tp.agent.input_types): - # get the class of the input embedder - self.input_embedders.append(self.get_input_embedder(input_type)) - - # in the case each head uses a different network, we still reuse the input placeholders - prev_network_input_placeholder = self.inputs[idx] if done_creating_input_placeholders else None - - # create the input embedder instance and store the input placeholder and the embedding - input_placeholder, embedding = self.input_embedders[-1](prev_network_input_placeholder) - if len(self.inputs) < len(self.tp.agent.input_types): - self.inputs.append(input_placeholder) - state_embedding.append(embedding) - - done_creating_input_placeholders = True - - ############## - # Middleware # - ############## - - state_embedding = ng.concat_along_axis(state_embedding, state_embedding[0].axes[0]) \ - if len(state_embedding) > 1 else state_embedding[0] - self.middleware_embedder = self.get_middleware_embedder(self.tp.agent.middleware_type) - _, self.state_embedding = self.middleware_embedder(state_embedding) - - ################ - # Output Heads # - ################ - - for head_idx in range(self.num_heads_per_network): - for head_copy_idx in range(self.tp.agent.num_output_head_copies): - if self.tp.agent.use_separate_networks_per_head: - # if we use separate networks per head, then the head type corresponds top the network idx - head_type_idx = network_idx - else: - # if we use a single network with multiple heads, then the head type is the current head idx - head_type_idx = head_idx - self.output_heads.append(self.get_output_head(self.tp.agent.output_types[head_type_idx], - head_copy_idx, - self.tp.agent.loss_weights[head_type_idx])) - if self.network_is_local: - output, target_placeholder, input_placeholder = self.output_heads[-1](self.state_embedding) - self.targets.extend(target_placeholder) - else: - output, input_placeholder = self.output_heads[-1](self.state_embedding) - - self.outputs.extend(output) - self.inputs.extend(input_placeholder) - - # Losses - self.losses = [] - for output_head in self.output_heads: - self.losses += output_head.loss - self.total_loss = sum(self.losses) - - # Learning rate - if self.tp.learning_rate_decay_rate != 0: - raise Exception("learning rate decay is not supported in neon") - - # Optimizer - if local_network_in_distributed_training and \ - hasattr(self.tp.agent, "shared_optimizer") and 
self.tp.agent.shared_optimizer: - # distributed training and this is the local network instantiation - self.optimizer = self.global_network.optimizer - else: - if tuning_parameters.agent.optimizer_type == 'Adam': - self.optimizer = neon.Adam( - learning_rate=tuning_parameters.learning_rate, - gradient_clip_norm=tuning_parameters.clip_gradients - ) - elif tuning_parameters.agent.optimizer_type == 'RMSProp': - self.optimizer = neon.RMSProp( - learning_rate=tuning_parameters.learning_rate, - gradient_clip_norm=tuning_parameters.clip_gradients, - decay_rate=0.9, - epsilon=0.01 - ) - elif tuning_parameters.agent.optimizer_type == 'LBFGS': - raise Exception("LBFGS optimizer is not supported in neon") - else: - raise Exception("{} is not a valid optimizer type".format(tuning_parameters.agent.optimizer_type)) diff --git a/architectures/neon_components/heads.py b/architectures/neon_components/heads.py deleted file mode 100644 index df49867..0000000 --- a/architectures/neon_components/heads.py +++ /dev/null @@ -1,194 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -import ngraph as ng -from ngraph.util.names import name_scope -import ngraph.frontends.neon as neon -import numpy as np -from utils import force_list -from architectures.neon_components.losses import * - - -class Head(object): - def __init__(self, tuning_parameters, head_idx=0, loss_weight=1., is_local=True): - self.head_idx = head_idx - self.name = "head" - self.output = [] - self.loss = [] - self.loss_type = [] - self.regularizations = [] - self.loss_weight = force_list(loss_weight) - self.weights_init = neon.GlorotInit() - self.biases_init = neon.ConstantInit() - self.target = [] - self.input = [] - self.is_local = is_local - self.batch_size = tuning_parameters.batch_size - - def __call__(self, input_layer): - """ - Wrapper for building the module graph including scoping and loss creation - :param input_layer: the input to the graph - :return: the output of the last layer and the target placeholder - """ - with name_scope(self.get_name()): - self._build_module(input_layer) - - self.output = force_list(self.output) - self.target = force_list(self.target) - self.input = force_list(self.input) - self.loss_type = force_list(self.loss_type) - self.loss = force_list(self.loss) - self.regularizations = force_list(self.regularizations) - if self.is_local: - self.set_loss() - - if self.is_local: - return self.output, self.target, self.input - else: - return self.output, self.input - - def _build_module(self, input_layer): - """ - Builds the graph of the module - :param input_layer: the input to the graph - :return: None - """ - pass - - def get_name(self): - """ - Get a formatted name for the module - :return: the formatted name - """ - return '{}_{}'.format(self.name, self.head_idx) - - def set_loss(self): - """ - Creates a target placeholder and loss function for each loss_type and regularization - :param loss_type: a tensorflow loss function - :param scope: the name scope to include the 
tensors in - :return: None - """ - # add losses and target placeholder - for idx in range(len(self.loss_type)): - # output_axis = ng.make_axis(self.num_actions, name='q_values') - batch_axis_full = ng.make_axis(self.batch_size, name='N') - target = ng.placeholder(ng.make_axes([self.output[0].axes[0], batch_axis_full])) - self.target.append(target) - loss = self.loss_type[idx](self.target[-1], self.output[idx], - weights=self.loss_weight[idx], scope=self.get_name()) - self.loss.append(loss) - - # add regularizations - for regularization in self.regularizations: - self.loss.append(regularization) - - -class QHead(Head): - def __init__(self, tuning_parameters, head_idx=0, loss_weight=1., is_local=True): - Head.__init__(self, tuning_parameters, head_idx, loss_weight, is_local) - self.name = 'q_values_head' - self.num_actions = tuning_parameters.env_instance.action_space_size - if tuning_parameters.agent.replace_mse_with_huber_loss: - raise Exception("huber loss is not supported in neon") - else: - self.loss_type = mean_squared_error - - def _build_module(self, input_layer): - # Standard Q Network - self.output = neon.Sequential([ - neon.Affine(nout=self.num_actions, - weight_init=self.weights_init, bias_init=self.biases_init) - ])(input_layer) - - -class DuelingQHead(QHead): - def __init__(self, tuning_parameters, head_idx=0, loss_weight=1., is_local=True): - QHead.__init__(self, tuning_parameters, head_idx, loss_weight, is_local) - - def _build_module(self, input_layer): - # Dueling Network - # state value tower - V - output_axis = ng.make_axis(self.num_actions, name='q_values') - - state_value = neon.Sequential([ - neon.Affine(nout=256, activation=neon.Rectlin(), - weight_init=self.weights_init, bias_init=self.biases_init), - neon.Affine(nout=1, - weight_init=self.weights_init, bias_init=self.biases_init) - ])(input_layer) - - # action advantage tower - A - action_advantage_unnormalized = neon.Sequential([ - neon.Affine(nout=256, activation=neon.Rectlin(), - weight_init=self.weights_init, bias_init=self.biases_init), - neon.Affine(axes=output_axis, - weight_init=self.weights_init, bias_init=self.biases_init) - ])(input_layer) - action_advantage = action_advantage_unnormalized - ng.mean(action_advantage_unnormalized) - - repeated_state_value = ng.expand_dims(ng.slice_along_axis(state_value, state_value.axes[0], 0), output_axis, 0) - - # merge to state-action value function Q - self.output = repeated_state_value + action_advantage - - -class MeasurementsPredictionHead(Head): - def __init__(self, tuning_parameters, head_idx=0, loss_weight=1., is_local=True): - Head.__init__(self, tuning_parameters, head_idx, loss_weight, is_local) - self.name = 'future_measurements_head' - self.num_actions = tuning_parameters.env_instance.action_space_size - self.num_measurements = tuning_parameters.env.measurements_size[0] \ - if tuning_parameters.env.measurements_size else 0 - self.num_prediction_steps = tuning_parameters.agent.num_predicted_steps_ahead - self.multi_step_measurements_size = self.num_measurements * self.num_prediction_steps - if tuning_parameters.agent.replace_mse_with_huber_loss: - raise Exception("huber loss is not supported in neon") - else: - self.loss_type = mean_squared_error - - def _build_module(self, input_layer): - # This is almost exactly the same as Dueling Network but we predict the future measurements for each action - - multistep_measurements_size = self.measurements_size[0] * self.num_predicted_steps_ahead - - # actions expectation tower (expectation stream) - E - with 
name_scope("expectation_stream"): - expectation_stream = neon.Sequential([ - neon.Affine(nout=256, activation=neon.Rectlin(), - weight_init=self.weights_init, bias_init=self.biases_init), - neon.Affine(nout=multistep_measurements_size, - weight_init=self.weights_init, bias_init=self.biases_init) - ])(input_layer) - - # action fine differences tower (action stream) - A - with name_scope("action_stream"): - action_stream_unnormalized = neon.Sequential([ - neon.Affine(nout=256, activation=neon.Rectlin(), - weight_init=self.weights_init, bias_init=self.biases_init), - neon.Affine(nout=self.num_actions * multistep_measurements_size, - weight_init=self.weights_init, bias_init=self.biases_init), - neon.Reshape((self.num_actions, multistep_measurements_size)) - ])(input_layer) - action_stream = action_stream_unnormalized - ng.mean(action_stream_unnormalized) - - repeated_expectation_stream = ng.slice_along_axis(expectation_stream, expectation_stream.axes[0], 0) - repeated_expectation_stream = ng.expand_dims(repeated_expectation_stream, output_axis, 0) - - # merge to future measurements predictions - self.output = repeated_expectation_stream + action_stream - diff --git a/architectures/neon_components/middleware.py b/architectures/neon_components/middleware.py deleted file mode 100644 index 2aa02fd..0000000 --- a/architectures/neon_components/middleware.py +++ /dev/null @@ -1,50 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -import ngraph as ng -import ngraph.frontends.neon as neon -from ngraph.util.names import name_scope -import numpy as np - - -class MiddlewareEmbedder(object): - def __init__(self, activation_function=neon.Rectlin(), name="middleware_embedder"): - self.name = name - self.input = None - self.output = None - self.weights_init = neon.GlorotInit() - self.biases_init = neon.ConstantInit() - self.activation_function = activation_function - - def __call__(self, input_layer): - with name_scope(self.get_name()): - self.input = input_layer - self._build_module() - - return self.input, self.output(self.input) - - def _build_module(self): - pass - - def get_name(self): - return self.name - - -class FC_Embedder(MiddlewareEmbedder): - def _build_module(self): - self.output = neon.Sequential([ - neon.Affine(nout=512, activation=self.activation_function, - weight_init=self.weights_init, bias_init=self.biases_init)]) diff --git a/architectures/network_wrapper.py b/architectures/network_wrapper.py deleted file mode 100644 index 7388587..0000000 --- a/architectures/network_wrapper.py +++ /dev/null @@ -1,187 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -from collections import OrderedDict -from configurations import Preset, Frameworks -from logger import * -try: - import tensorflow as tf - from architectures.tensorflow_components.general_network import GeneralTensorFlowNetwork -except ImportError: - failed_imports.append("TensorFlow") - -try: - from architectures.neon_components.general_network import GeneralNeonNetwork -except ImportError: - failed_imports.append("Neon") - - -class NetworkWrapper(object): - """ - Contains multiple networks and managers syncing and gradient updates - between them. - """ - def __init__(self, tuning_parameters, has_target, has_global, name, replicated_device=None, worker_device=None): - """ - :param tuning_parameters: - :type tuning_parameters: Preset - :param has_target: - :param has_global: - :param name: - :param replicated_device: - :param worker_device: - """ - self.tp = tuning_parameters - self.has_target = has_target - self.has_global = has_global - self.name = name - self.sess = tuning_parameters.sess - - if self.tp.framework == Frameworks.TensorFlow: - general_network = GeneralTensorFlowNetwork - elif self.tp.framework == Frameworks.Neon: - general_network = GeneralNeonNetwork - else: - raise Exception("{} Framework is not supported".format(Frameworks().to_string(self.tp.framework))) - - # Global network - the main network shared between threads - self.global_network = None - if self.has_global: - with tf.device(replicated_device): - self.global_network = general_network(tuning_parameters, '{}/global'.format(name), - network_is_local=False) - - # Online network - local copy of the main network used for playing - self.online_network = None - with tf.device(worker_device): - self.online_network = general_network(tuning_parameters, '{}/online'.format(name), - self.global_network, network_is_local=True) - - # Target network - a local, slow updating network used for stabilizing the learning - self.target_network = None - if self.has_target: - with tf.device(worker_device): - self.target_network = general_network(tuning_parameters, '{}/target'.format(name), - network_is_local=True) - - if not self.tp.distributed and self.tp.framework == Frameworks.TensorFlow: - variables_to_restore = tf.global_variables() - variables_to_restore = [v for v in variables_to_restore if '/online' in v.name] - self.model_saver = tf.train.Saver(variables_to_restore) - #, max_to_keep=None) # uncomment to unlimit number of stored checkpoints - if self.tp.sess and self.tp.checkpoint_restore_dir: - checkpoint = tf.train.latest_checkpoint(self.tp.checkpoint_restore_dir) - screen.log_title("Loading checkpoint: {}".format(checkpoint)) - self.model_saver.restore(self.tp.sess, checkpoint) - self.update_target_network() - - def sync(self): - """ - Initializes the weights of the networks to match each other - :return: - """ - self.update_online_network() - self.update_target_network() - - def update_target_network(self, rate=1.0): - """ - Copy weights: online network >>> target network - :param rate: the rate of copying the weights - 1 for copying exactly - """ - if self.target_network: - 
self.target_network.set_weights(self.online_network.get_weights(), rate) - - def update_online_network(self, rate=1.0): - """ - Copy weights: global network >>> online network - :param rate: the rate of copying the weights - 1 for copying exactly - """ - if self.global_network: - self.online_network.set_weights(self.global_network.get_weights(), rate) - - def apply_gradients_to_global_network(self): - """ - Apply gradients from the online network on the global network - :return: - """ - self.global_network.apply_gradients(self.online_network.accumulated_gradients) - - def apply_gradients_to_online_network(self): - """ - Apply gradients from the online network on itself - :return: - """ - self.online_network.apply_gradients(self.online_network.accumulated_gradients) - - def train_and_sync_networks(self, inputs, targets, additional_fetches=[]): - """ - A generic training function that enables multi-threading training using a global network if necessary. - :param inputs: The inputs for the network. - :param targets: The targets corresponding to the given inputs - :param additional_fetches: Any additional tensor the user wants to fetch - :return: The loss of the training iteration - """ - result = self.online_network.accumulate_gradients(inputs, targets, additional_fetches=additional_fetches) - self.apply_gradients_and_sync_networks() - return result - - def apply_gradients_and_sync_networks(self): - """ - Applies the gradients accumulated in the online network to the global network or to itself and syncs the - networks if necessary - """ - if self.global_network: - self.apply_gradients_to_global_network() - self.online_network.reset_accumulated_gradients() - self.update_online_network() - else: - self.online_network.apply_and_reset_gradients(self.online_network.accumulated_gradients) - - def get_local_variables(self): - """ - Get all the variables that are local to the thread - :return: a list of all the variables that are local to the thread - """ - local_variables = [v for v in tf.global_variables() if self.online_network.name in v.name] - if self.has_target: - local_variables += [v for v in tf.global_variables() if self.target_network.name in v.name] - return local_variables - - def get_global_variables(self): - """ - Get all the variables that are shared between threads - :return: a list of all the variables that are shared between threads - """ - global_variables = [v for v in tf.global_variables() if self.global_network.name in v.name] - return global_variables - - def set_session(self, sess): - self.sess = sess - self.online_network.sess = sess - if self.global_network: - self.global_network.sess = sess - if self.target_network: - self.target_network.sess = sess - - def save_model(self, model_id): - saved_model_path = self.model_saver.save(self.tp.sess, os.path.join(self.tp.save_model_dir, - str(model_id) + '.ckpt')) - screen.log_dict( - OrderedDict([ - ("Saving model", saved_model_path), - ]), - prefix="Checkpoint" - ) diff --git a/architectures/tensorflow_components/architecture.py b/architectures/tensorflow_components/architecture.py deleted file mode 100644 index 006ed2c..0000000 --- a/architectures/tensorflow_components/architecture.py +++ /dev/null @@ -1,367 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
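# train_and_sync_networks() above boils down to one update cycle per batch. A schematic
# sketch of the control flow, using the method names defined above:
#
#     loss = online_network.accumulate_gradients(inputs, targets)
#     if global_network is not None:            # distributed training
#         apply_gradients_to_global_network()   # push the local gradients to the shared weights
#         online_network.reset_accumulated_gradients()
#         update_online_network()               # pull the fresh global weights back
#     else:                                     # single-worker training
#         online_network.apply_and_reset_gradients(online_network.accumulated_gradients)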
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# -import time - -import numpy as np -import tensorflow as tf - -from architectures.architecture import Architecture -from utils import force_list, squeeze_list -from configurations import Preset, MiddlewareTypes - -def variable_summaries(var): - """Attach a lot of summaries to a Tensor (for TensorBoard visualization).""" - with tf.name_scope('summaries'): - layer_weight_name = '_'.join(var.name.split('/')[-3:])[:-2] - - with tf.name_scope(layer_weight_name): - mean = tf.reduce_mean(var) - tf.summary.scalar('mean', mean) - with tf.name_scope('stddev'): - stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean))) - tf.summary.scalar('stddev', stddev) - tf.summary.scalar('max', tf.reduce_max(var)) - tf.summary.scalar('min', tf.reduce_min(var)) - tf.summary.histogram('histogram', var) - -class TensorFlowArchitecture(Architecture): - def __init__(self, tuning_parameters, name="", global_network=None, network_is_local=True): - """ - :param tuning_parameters: The parameters used for running the algorithm - :type tuning_parameters: Preset - :param name: The name of the network - """ - Architecture.__init__(self, tuning_parameters, name) - self.middleware_embedder = None - self.network_is_local = network_is_local - assert tuning_parameters.agent.tensorflow_support, 'TensorFlow is not supported for this agent' - self.sess = tuning_parameters.sess - self.inputs = {} - self.outputs = [] - self.targets = [] - self.losses = [] - self.total_loss = None - self.trainable_weights = [] - self.weights_placeholders = [] - self.curr_rnn_c_in = None - self.curr_rnn_h_in = None - self.gradients_wrt_inputs = [] - self.train_writer = None - - self.optimizer_type = self.tp.agent.optimizer_type - if self.tp.seed is not None: - tf.set_random_seed(self.tp.seed) - with tf.variable_scope(self.name, initializer=tf.contrib.layers.xavier_initializer()): - self.global_step = tf.train.get_or_create_global_step() - - # build the network - self.get_model(tuning_parameters) - - # model weights - self.trainable_weights = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=self.name) - - # locks for synchronous training - if self.tp.distributed and not self.tp.agent.async_training and not self.network_is_local: - self.lock_counter = tf.get_variable("lock_counter", [], tf.int32, - initializer=tf.constant_initializer(0, dtype=tf.int32), - trainable=False) - self.lock = self.lock_counter.assign_add(1, use_locking=True) - self.lock_init = self.lock_counter.assign(0) - - self.release_counter = tf.get_variable("release_counter", [], tf.int32, - initializer=tf.constant_initializer(0, dtype=tf.int32), - trainable=False) - self.release = self.release_counter.assign_add(1, use_locking=True) - self.release_init = self.release_counter.assign(0) - - # local network does the optimization so we need to create all the ops we are going to use to optimize - for idx, var in enumerate(self.trainable_weights): - placeholder = tf.placeholder(tf.float32, shape=var.get_shape(), name=str(idx) + '_holder') - self.weights_placeholders.append(placeholder) - if self.tp.visualization.tensorboard: - variable_summaries(var) - - 
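# The placeholders created in the loop above mirror each trainable variable one-to-one;
# the assign ops built next (update_weights_from_list) and the apply_gradients op
# defined further down feed externally supplied weight values / accumulated gradients
# into the graph through these placeholders.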
self.update_weights_from_list = [weights.assign(holder) for holder, weights in - zip(self.weights_placeholders, self.trainable_weights)] - - # gradients ops - self.tensor_gradients = tf.gradients(self.total_loss, self.trainable_weights) - self.gradients_norm = tf.global_norm(self.tensor_gradients) - if self.tp.clip_gradients is not None and self.tp.clip_gradients != 0: - self.clipped_grads, self.grad_norms = tf.clip_by_global_norm(self.tensor_gradients, - tuning_parameters.clip_gradients) - - # gradients of the outputs w.r.t. the inputs - # at the moment, this is only used by ddpg - if len(self.outputs) == 1: - self.gradients_wrt_inputs = {name: tf.gradients(self.outputs[0], input_ph) for name, input_ph in self.inputs.items()} - self.gradients_weights_ph = tf.placeholder('float32', self.outputs[0].shape, 'output_gradient_weights') - self.weighted_gradients = tf.gradients(self.outputs[0], self.trainable_weights, self.gradients_weights_ph) - - # L2 regularization - if self.tp.agent.l2_regularization != 0: - self.l2_regularization = [tf.add_n([tf.nn.l2_loss(v) for v in self.trainable_weights]) - * self.tp.agent.l2_regularization] - tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, self.l2_regularization) - - self.inc_step = self.global_step.assign_add(1) - - # defining the optimization process (for LBFGS we have less control over the optimizer) - if self.optimizer_type != 'LBFGS': - # no global network, this is a plain simple centralized training - self.update_weights_from_batch_gradients = self.optimizer.apply_gradients( - zip(self.weights_placeholders, self.trainable_weights), global_step=self.global_step) - - if self.tp.visualization.tensorboard: - current_scope_summaries = tf.get_collection(tf.GraphKeys.SUMMARIES, - scope=tf.contrib.framework.get_name_scope()) - self.merged = tf.summary.merge(current_scope_summaries) - - # initialize or restore model - if not self.tp.distributed: - # Merge all the summaries - - self.init_op = tf.global_variables_initializer() - - if self.sess: - if self.tp.visualization.tensorboard: - # Write the merged summaries to the current experiment directory - self.train_writer = tf.summary.FileWriter(self.tp.experiment_path + '/tensorboard', - self.sess.graph) - self.sess.run(self.init_op) - - self.accumulated_gradients = None - - def reset_accumulated_gradients(self): - """ - Reset the gradients accumulation placeholder - """ - if self.accumulated_gradients is None: - self.accumulated_gradients = self.tp.sess.run(self.trainable_weights) - - for ix, grad in enumerate(self.accumulated_gradients): - self.accumulated_gradients[ix] = grad * 0 - - def accumulate_gradients(self, inputs, targets, additional_fetches=None): - """ - Runs a forward pass & backward pass, clips gradients if needed and accumulates them into the accumulation - placeholders - :param additional_fetches: Optional tensors to fetch during gradients calculation - :param inputs: The input batch for the network - :param targets: The targets corresponding to the input batch - :return: A list containing the total loss and the individual network heads losses - """ - - if self.accumulated_gradients is None: - self.reset_accumulated_gradients() - - # feed inputs - if additional_fetches is None: - additional_fetches = [] - - feed_dict = self._feed_dict(inputs) - - # feed targets - targets = force_list(targets) - for placeholder_idx, target in enumerate(targets): - feed_dict[self.targets[placeholder_idx]] = target - - if self.optimizer_type != 'LBFGS': - # set the fetches - fetches = 
[self.gradients_norm] - if self.tp.clip_gradients: - fetches.append(self.clipped_grads) - else: - fetches.append(self.tensor_gradients) - fetches += [self.total_loss, self.losses] - if self.tp.agent.middleware_type == MiddlewareTypes.LSTM: - fetches.append(self.middleware_embedder.state_out) - additional_fetches_start_idx = len(fetches) - fetches += additional_fetches - - # feed the lstm state if necessary - if self.tp.agent.middleware_type == MiddlewareTypes.LSTM: - # we can't always assume that we are starting from scratch here can we? - feed_dict[self.middleware_embedder.c_in] = self.middleware_embedder.c_init - feed_dict[self.middleware_embedder.h_in] = self.middleware_embedder.h_init - - if self.tp.visualization.tensorboard: - fetches += [self.merged] - - # get grads - result = self.tp.sess.run(fetches, feed_dict=feed_dict) - if hasattr(self, 'train_writer') and self.train_writer is not None: - self.train_writer.add_summary(result[-1], self.tp.current_episode) - - # extract the fetches - norm_unclipped_grads, grads, total_loss, losses = result[:4] - if self.tp.agent.middleware_type == MiddlewareTypes.LSTM: - (self.curr_rnn_c_in, self.curr_rnn_h_in) = result[4] - fetched_tensors = [] - if len(additional_fetches) > 0: - fetched_tensors = result[additional_fetches_start_idx:additional_fetches_start_idx + - len(additional_fetches)] - - # accumulate the gradients - for idx, grad in enumerate(grads): - self.accumulated_gradients[idx] += grad - - return total_loss, losses, norm_unclipped_grads, fetched_tensors - - else: - self.optimizer.minimize(session=self.tp.sess, feed_dict=feed_dict) - - return [0] - - def apply_and_reset_gradients(self, gradients, scaler=1.): - """ - Applies the given gradients to the network weights and resets the accumulation placeholder - :param gradients: The gradients to use for the update - :param scaler: A scaling factor that allows rescaling the gradients before applying them - """ - self.apply_gradients(gradients, scaler) - self.reset_accumulated_gradients() - - def apply_gradients(self, gradients, scaler=1.): - """ - Applies the given gradients to the network weights - :param gradients: The gradients to use for the update - :param scaler: A scaling factor that allows rescaling the gradients before applying them - """ - if self.tp.agent.async_training or not self.tp.distributed: - if hasattr(self, 'global_step') and not self.network_is_local: - self.tp.sess.run(self.inc_step) - - if self.optimizer_type != 'LBFGS': - - # lock barrier - if hasattr(self, 'lock_counter'): - self.tp.sess.run(self.lock) - while self.tp.sess.run(self.lock_counter) % self.tp.num_threads != 0: - time.sleep(0.00001) - # rescale the gradients so that they average out with the gradients from the other workers - scaler /= float(self.tp.num_threads) - - # apply gradients - if scaler != 1.: - for gradient in gradients: - gradient /= scaler - feed_dict = dict(zip(self.weights_placeholders, gradients)) - _ = self.tp.sess.run(self.update_weights_from_batch_gradients, feed_dict=feed_dict) - - # release barrier - if hasattr(self, 'release_counter'): - self.tp.sess.run(self.release) - while self.tp.sess.run(self.release_counter) % self.tp.num_threads != 0: - time.sleep(0.00001) - - def _feed_dict(self, inputs): - feed_dict = {} - for input_name, input_value in inputs.items(): - if isinstance(input_name, str): - if input_name not in self.inputs: - raise ValueError(( - 'input name {input_name} was provided to create a feed ' - 'dictionary, but there is no placeholder with that name. 
' - 'placeholder names available include: {placeholder_names}' - ).format( - input_name=input_name, - placeholder_names=', '.join(self.inputs.keys()) - )) - - feed_dict[self.inputs[input_name]] = input_value - elif isinstance(input_name, tf.Tensor) and input_name.op.type == 'Placeholder': - feed_dict[input_name] = input_value - else: - raise ValueError(( - 'input dictionary expects strings or placeholders as keys, ' - 'but found key {key} of type {type}' - ).format( - key=input_name, - type=type(input_name), - )) - - return feed_dict - - def predict(self, inputs, outputs=None, squeeze_output=True): - """ - Run a forward pass of the network using the given input - :param inputs: The input for the network - :param outputs: The output for the network, defaults to self.outputs - :param squeeze_output: call squeeze_list on output - :return: The network output - - WARNING: must only call once per state since each call is assumed by LSTM to be a new time step. - """ - feed_dict = self._feed_dict(inputs) - if outputs is None: - outputs = self.outputs - - if self.tp.agent.middleware_type == MiddlewareTypes.LSTM: - feed_dict[self.middleware_embedder.c_in] = self.curr_rnn_c_in - feed_dict[self.middleware_embedder.h_in] = self.curr_rnn_h_in - - output, (self.curr_rnn_c_in, self.curr_rnn_h_in) = self.tp.sess.run([outputs, self.middleware_embedder.state_out], feed_dict=feed_dict) - else: - output = self.tp.sess.run(outputs, feed_dict) - - if squeeze_output: - output = squeeze_list(output) - - return output - - def get_weights(self): - """ - :return: a list of tensors containing the network weights for each layer - """ - return self.trainable_weights - - def set_weights(self, weights, new_rate=1.0): - """ - Sets the network weights from the given list of weights tensors - """ - feed_dict = {} - old_weights, new_weights = self.tp.sess.run([self.get_weights(), weights]) - for placeholder_idx, new_weight in enumerate(new_weights): - feed_dict[self.weights_placeholders[placeholder_idx]]\ - = new_rate * new_weight + (1 - new_rate) * old_weights[placeholder_idx] - self.tp.sess.run(self.update_weights_from_list, feed_dict) - - def write_graph_to_logdir(self, summary_dir): - """ - Writes the tensorflow graph to the logdir for tensorboard visualization - :param summary_dir: the path to the logdir - """ - summary_writer = tf.summary.FileWriter(summary_dir) - summary_writer.add_graph(self.sess.graph) - - def get_variable_value(self, variable): - """ - Get the value of a variable from the graph - :param variable: the variable - :return: the value of the variable - """ - return self.sess.run(variable) - - def set_variable_value(self, assign_op, value, placeholder=None): - """ - Updates the value of a variable. - This requires having an assign operation for the variable, and a placeholder which will provide the value - :param assign_op: an assign operation for the variable - :param value: a value to set the variable to - :param placeholder: a placeholder to hold the given value for injecting it into the variable - """ - self.sess.run(assign_op, feed_dict={placeholder: value}) diff --git a/architectures/tensorflow_components/embedders.py b/architectures/tensorflow_components/embedders.py deleted file mode 100644 index b3f36cb..0000000 --- a/architectures/tensorflow_components/embedders.py +++ /dev/null @@ -1,144 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
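# _feed_dict() above accepts input dictionaries keyed either by the string names used
# when the input placeholders were created or by the placeholder tensors themselves.
# A hypothetical call (the 'observation' key matches the embedder name used elsewhere
# in this codebase; the shape is purely illustrative):
#
#     network.predict({'observation': np.zeros((1, 84, 84, 4))})
#
# Unknown string keys raise a ValueError listing the available placeholder names.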
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -import tensorflow as tf -from configurations import EmbedderDepth, EmbedderWidth - - -class InputEmbedder(object): - def __init__(self, input_size, activation_function=tf.nn.relu, - embedder_depth=EmbedderDepth.Shallow, embedder_width=EmbedderWidth.Wide, - name="embedder"): - self.name = name - self.input_size = input_size - self.activation_function = activation_function - self.input = None - self.output = None - self.embedder_depth = embedder_depth - self.embedder_width = embedder_width - - def __call__(self, prev_input_placeholder=None): - with tf.variable_scope(self.get_name()): - if prev_input_placeholder is None: - self.input = tf.placeholder("float", shape=(None,) + self.input_size, name=self.get_name()) - else: - self.input = prev_input_placeholder - self._build_module() - - return self.input, self.output - - def _build_module(self): - pass - - def get_name(self): - return self.name - - -class ImageEmbedder(InputEmbedder): - def __init__(self, input_size, input_rescaler=255.0, activation_function=tf.nn.relu, - embedder_depth=EmbedderDepth.Shallow, embedder_width=EmbedderWidth.Wide, - name="embedder"): - InputEmbedder.__init__(self, input_size, activation_function, embedder_depth, embedder_width, name) - self.input_rescaler = input_rescaler - - def _build_module(self): - # image observation - rescaled_observation_stack = self.input / self.input_rescaler - - if self.embedder_depth == EmbedderDepth.Shallow: - # same embedder as used in the original DQN paper - self.observation_conv1 = tf.layers.conv2d(rescaled_observation_stack, - filters=32, kernel_size=(8, 8), strides=(4, 4), - activation=self.activation_function, data_format='channels_last', - name='conv1') - self.observation_conv2 = tf.layers.conv2d(self.observation_conv1, - filters=64, kernel_size=(4, 4), strides=(2, 2), - activation=self.activation_function, data_format='channels_last', - name='conv2') - self.observation_conv3 = tf.layers.conv2d(self.observation_conv2, - filters=64, kernel_size=(3, 3), strides=(1, 1), - activation=self.activation_function, data_format='channels_last', - name='conv3' - ) - - self.output = tf.contrib.layers.flatten(self.observation_conv3) - - elif self.embedder_depth == EmbedderDepth.Deep: - # the embedder used in the CARLA papers - self.observation_conv1 = tf.layers.conv2d(rescaled_observation_stack, - filters=32, kernel_size=(5, 5), strides=(2, 2), - activation=self.activation_function, data_format='channels_last', - name='conv1') - self.observation_conv2 = tf.layers.conv2d(self.observation_conv1, - filters=32, kernel_size=(3, 3), strides=(1, 1), - activation=self.activation_function, data_format='channels_last', - name='conv2') - self.observation_conv3 = tf.layers.conv2d(self.observation_conv2, - filters=64, kernel_size=(3, 3), strides=(2, 2), - activation=self.activation_function, data_format='channels_last', - name='conv3') - self.observation_conv4 = tf.layers.conv2d(self.observation_conv3, - filters=64, kernel_size=(3, 3), strides=(1, 1), - activation=self.activation_function, data_format='channels_last', - name='conv4') - self.observation_conv5 = 
tf.layers.conv2d(self.observation_conv4, - filters=128, kernel_size=(3, 3), strides=(2, 2), - activation=self.activation_function, data_format='channels_last', - name='conv5') - self.observation_conv6 = tf.layers.conv2d(self.observation_conv5, - filters=128, kernel_size=(3, 3), strides=(1, 1), - activation=self.activation_function, data_format='channels_last', - name='conv6') - self.observation_conv7 = tf.layers.conv2d(self.observation_conv6, - filters=256, kernel_size=(3, 3), strides=(2, 2), - activation=self.activation_function, data_format='channels_last', - name='conv7') - self.observation_conv8 = tf.layers.conv2d(self.observation_conv7, - filters=256, kernel_size=(3, 3), strides=(1, 1), - activation=self.activation_function, data_format='channels_last', - name='conv8') - - self.output = tf.contrib.layers.flatten(self.observation_conv8) - else: - raise ValueError("The defined embedder complexity value is invalid") - - -class VectorEmbedder(InputEmbedder): - def __init__(self, input_size, activation_function=tf.nn.relu, - embedder_depth=EmbedderDepth.Shallow, embedder_width=EmbedderWidth.Wide, - name="embedder"): - InputEmbedder.__init__(self, input_size, activation_function, embedder_depth, embedder_width, name) - - def _build_module(self): - # vector observation - input_layer = tf.contrib.layers.flatten(self.input) - - width = 128 if self.embedder_width == EmbedderWidth.Wide else 32 - - if self.embedder_depth == EmbedderDepth.Shallow: - self.output = tf.layers.dense(input_layer, 2*width, activation=self.activation_function, - name='fc1') - - elif self.embedder_depth == EmbedderDepth.Deep: - # the embedder used in the CARLA papers - self.observation_fc1 = tf.layers.dense(input_layer, width, activation=self.activation_function, - name='fc1') - self.observation_fc2 = tf.layers.dense(self.observation_fc1, width, activation=self.activation_function, - name='fc2') - self.output = tf.layers.dense(self.observation_fc2, width, activation=self.activation_function, - name='fc3') - else: - raise ValueError("The defined embedder complexity value is invalid") diff --git a/architectures/tensorflow_components/general_network.py b/architectures/tensorflow_components/general_network.py deleted file mode 100644 index a4e69ff..0000000 --- a/architectures/tensorflow_components/general_network.py +++ /dev/null @@ -1,206 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -from architectures.tensorflow_components.embedders import * -from architectures.tensorflow_components.heads import * -from architectures.tensorflow_components.middleware import * -from architectures.tensorflow_components.architecture import * -from configurations import InputTypes, OutputTypes, MiddlewareTypes - - -class GeneralTensorFlowNetwork(TensorFlowArchitecture): - """ - A generalized version of all possible networks implemented using tensorflow. 
- """ - def __init__(self, tuning_parameters, name="", global_network=None, network_is_local=True): - self.global_network = global_network - self.network_is_local = network_is_local - self.num_heads_per_network = 1 if tuning_parameters.agent.use_separate_networks_per_head else \ - len(tuning_parameters.agent.output_types) - self.num_networks = 1 if not tuning_parameters.agent.use_separate_networks_per_head else \ - len(tuning_parameters.agent.output_types) - self.input_embedders = [] - self.output_heads = [] - self.activation_function = self.get_activation_function( - tuning_parameters.agent.hidden_layers_activation_function) - self.embedder_width = tuning_parameters.agent.embedder_width - - TensorFlowArchitecture.__init__(self, tuning_parameters, name, global_network, network_is_local) - - def get_activation_function(self, activation_function_string): - activation_functions = { - 'relu': tf.nn.relu, - 'tanh': tf.nn.tanh, - 'sigmoid': tf.nn.sigmoid, - 'elu': tf.nn.elu, - 'selu': tf.nn.selu, - 'none': None - } - assert activation_function_string in activation_functions.keys(), \ - "Activation function must be one of the following {}".format(activation_functions.keys()) - return activation_functions[activation_function_string] - - def get_input_embedder(self, embedder_type): - # the observation can be either an image or a vector - def get_observation_embedding(with_timestep=False): - if self.input_height > 1: - return ImageEmbedder((self.input_height, self.input_width, self.input_depth), name="observation", - input_rescaler=self.tp.agent.input_rescaler, embedder_width=self.embedder_width) - else: - return VectorEmbedder((self.input_width + int(with_timestep), self.input_depth), name="observation", - embedder_width=self.embedder_width) - - input_mapping = { - InputTypes.Observation: get_observation_embedding(), - InputTypes.Measurements: VectorEmbedder(self.measurements_size, name="measurements", - embedder_width=self.embedder_width), - InputTypes.GoalVector: VectorEmbedder(self.measurements_size, name="goal_vector", - embedder_width=self.embedder_width), - InputTypes.Action: VectorEmbedder((self.num_actions,), name="action", - embedder_width=self.embedder_width), - InputTypes.TimedObservation: get_observation_embedding(with_timestep=True), - } - return input_mapping[embedder_type] - - def get_middleware_embedder(self, middleware_type): - return {MiddlewareTypes.LSTM: LSTM_Embedder, - MiddlewareTypes.FC: FC_Embedder}.get(middleware_type)(self.activation_function, self.embedder_width) - - def get_output_head(self, head_type, head_idx, loss_weight=1.): - output_mapping = { - OutputTypes.Q: QHead, - OutputTypes.DuelingQ: DuelingQHead, - OutputTypes.V: VHead, - OutputTypes.Pi: PolicyHead, - OutputTypes.MeasurementsPrediction: MeasurementsPredictionHead, - OutputTypes.DNDQ: DNDQHead, - OutputTypes.NAF: NAFHead, - OutputTypes.PPO: PPOHead, - OutputTypes.PPO_V: PPOVHead, - OutputTypes.CategoricalQ: CategoricalQHead, - OutputTypes.QuantileRegressionQ: QuantileRegressionQHead - } - return output_mapping[head_type](self.tp, head_idx, loss_weight, self.network_is_local) - - def get_model(self, tuning_parameters): - """ - :param tuning_parameters: A Preset class instance with all the running paramaters - :type tuning_parameters: Preset - :return: A model - """ - assert len(self.tp.agent.input_types) > 0, "At least one input type should be defined" - assert len(self.tp.agent.output_types) > 0, "At least one output type should be defined" - assert self.tp.agent.middleware_type is not None, "Exactly one 
middleware type should be defined" - assert len(self.tp.agent.loss_weights) > 0, "At least one loss weight should be defined" - assert len(self.tp.agent.output_types) == len(self.tp.agent.loss_weights), \ - "Number of loss weights should match the number of output types" - local_network_in_distributed_training = self.global_network is not None and self.network_is_local - - tuning_parameters.activation_function = self.activation_function - - for network_idx in range(self.num_networks): - with tf.variable_scope('network_{}'.format(network_idx)): - #################### - # Input Embeddings # - #################### - - state_embedding = [] - for input_name, input_type in self.tp.agent.input_types.items(): - # get the class of the input embedder - input_embedder = self.get_input_embedder(input_type) - self.input_embedders.append(input_embedder) - - # input placeholders are reused between networks. on the first network, store the placeholders - # generated by the input_embedders in self.inputs. on the rest of the networks, pass - # the existing input_placeholders into the input_embedders. - if network_idx == 0: - input_placeholder, embedding = input_embedder() - self.inputs[input_name] = input_placeholder - else: - input_placeholder, embedding = input_embedder(self.inputs[input_name]) - - state_embedding.append(embedding) - - ############## - # Middleware # - ############## - - state_embedding = tf.concat(state_embedding, axis=-1) if len(state_embedding) > 1 else state_embedding[0] - self.middleware_embedder = self.get_middleware_embedder(self.tp.agent.middleware_type) - _, self.state_embedding = self.middleware_embedder(state_embedding) - - ################ - # Output Heads # - ################ - - for head_idx in range(self.num_heads_per_network): - for head_copy_idx in range(self.tp.agent.num_output_head_copies): - if self.tp.agent.use_separate_networks_per_head: - # if we use separate networks per head, then the head type corresponds top the network idx - head_type_idx = network_idx - else: - # if we use a single network with multiple heads, then the head type is the current head idx - head_type_idx = head_idx - self.output_heads.append(self.get_output_head(self.tp.agent.output_types[head_type_idx], - head_copy_idx, - self.tp.agent.loss_weights[head_type_idx])) - - if self.tp.agent.stop_gradients_from_head[head_idx]: - head_input = tf.stop_gradient(self.state_embedding) - else: - head_input = self.state_embedding - - # build the head - if self.network_is_local: - output, target_placeholder, input_placeholders = self.output_heads[-1](head_input) - self.targets.extend(target_placeholder) - else: - output, input_placeholders = self.output_heads[-1](head_input) - - self.outputs.extend(output) - # TODO: use head names as well - for placeholder_index, input_placeholder in enumerate(input_placeholders): - self.inputs['output_{}_{}'.format(head_idx, placeholder_index)] = input_placeholder - - # Losses - self.losses = tf.losses.get_losses(self.name) - self.losses += tf.losses.get_regularization_losses(self.name) - self.total_loss = tf.losses.compute_weighted_loss(self.losses, scope=self.name) - if self.tp.visualization.tensorboard: - tf.summary.scalar('total_loss', self.total_loss) - - - # Learning rate - if self.tp.learning_rate_decay_rate != 0: - self.tp.learning_rate = tf.train.exponential_decay( - self.tp.learning_rate, self.global_step, decay_steps=self.tp.learning_rate_decay_steps, - decay_rate=self.tp.learning_rate_decay_rate, staircase=True) - - # Optimizer - if 
local_network_in_distributed_training and \ - hasattr(self.tp.agent, "shared_optimizer") and self.tp.agent.shared_optimizer: - # distributed training and this is the local network instantiation - self.optimizer = self.global_network.optimizer - else: - if tuning_parameters.agent.optimizer_type == 'Adam': - self.optimizer = tf.train.AdamOptimizer(learning_rate=tuning_parameters.learning_rate) - elif tuning_parameters.agent.optimizer_type == 'RMSProp': - self.optimizer = tf.train.RMSPropOptimizer(tuning_parameters.learning_rate, decay=0.9, epsilon=0.01) - elif tuning_parameters.agent.optimizer_type == 'LBFGS': - self.optimizer = tf.contrib.opt.ScipyOptimizerInterface(self.total_loss, method='L-BFGS-B', - options={'maxiter': 25}) - else: - raise Exception("{} is not a valid optimizer type".format(tuning_parameters.agent.optimizer_type)) diff --git a/architectures/tensorflow_components/heads.py b/architectures/tensorflow_components/heads.py deleted file mode 100644 index b463d7f..0000000 --- a/architectures/tensorflow_components/heads.py +++ /dev/null @@ -1,558 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -import tensorflow as tf -import numpy as np -from utils import force_list - - -# Used to initialize weights for policy and value output layers -def normalized_columns_initializer(std=1.0): - def _initializer(shape, dtype=None, partition_info=None): - out = np.random.randn(*shape).astype(np.float32) - out *= std / np.sqrt(np.square(out).sum(axis=0, keepdims=True)) - return tf.constant(out) - return _initializer - - -class Head(object): - def __init__(self, tuning_parameters, head_idx=0, loss_weight=1., is_local=True): - self.head_idx = head_idx - self.name = "head" - self.output = [] - self.loss = [] - self.loss_type = [] - self.regularizations = [] - self.loss_weight = force_list(loss_weight) - self.target = [] - self.input = [] - self.is_local = is_local - - def __call__(self, input_layer): - """ - Wrapper for building the module graph including scoping and loss creation - :param input_layer: the input to the graph - :return: the output of the last layer and the target placeholder - """ - with tf.variable_scope(self.get_name(), initializer=tf.contrib.layers.xavier_initializer()): - self._build_module(input_layer) - - self.output = force_list(self.output) - self.target = force_list(self.target) - self.input = force_list(self.input) - self.loss_type = force_list(self.loss_type) - self.loss = force_list(self.loss) - self.regularizations = force_list(self.regularizations) - if self.is_local: - self.set_loss() - self._post_build() - - if self.is_local: - return self.output, self.target, self.input - else: - return self.output, self.input - - def _build_module(self, input_layer): - """ - Builds the graph of the module - - This method is called early on from __call__. It is expected to store the graph - in self.output. 
- - :param input_layer: the input to the graph - :return: None - """ - pass - - def _post_build(self): - """ - Optional function that allows adding any extra definitions after the head has been fully defined - For example, this allows doing additional calculations that are based on the loss - :return: None - """ - pass - - def get_name(self): - """ - Get a formatted name for the module - :return: the formatted name - """ - return '{}_{}'.format(self.name, self.head_idx) - - def set_loss(self): - """ - Creates a target placeholder and loss function for each loss_type and regularization - :param loss_type: a tensorflow loss function - :param scope: the name scope to include the tensors in - :return: None - """ - # add losses and target placeholder - for idx in range(len(self.loss_type)): - target = tf.placeholder('float', self.output[idx].shape, '{}_target'.format(self.get_name())) - self.target.append(target) - loss = self.loss_type[idx](self.target[-1], self.output[idx], - weights=self.loss_weight[idx], scope=self.get_name()) - self.loss.append(loss) - - # add regularizations - for regularization in self.regularizations: - self.loss.append(regularization) - - -class QHead(Head): - def __init__(self, tuning_parameters, head_idx=0, loss_weight=1., is_local=True): - Head.__init__(self, tuning_parameters, head_idx, loss_weight, is_local) - self.name = 'q_values_head' - self.num_actions = tuning_parameters.env_instance.action_space_size - if tuning_parameters.agent.replace_mse_with_huber_loss: - self.loss_type = tf.losses.huber_loss - else: - self.loss_type = tf.losses.mean_squared_error - - def _build_module(self, input_layer): - # Standard Q Network - self.output = tf.layers.dense(input_layer, self.num_actions, name='output') - - -class DuelingQHead(QHead): - def __init__(self, tuning_parameters, head_idx=0, loss_weight=1., is_local=True): - QHead.__init__(self, tuning_parameters, head_idx, loss_weight, is_local) - - def _build_module(self, input_layer): - # state value tower - V - with tf.variable_scope("state_value"): - state_value = tf.layers.dense(input_layer, 256, activation=tf.nn.relu, name='fc1') - state_value = tf.layers.dense(state_value, 1, name='fc2') - # state_value = tf.expand_dims(state_value, axis=-1) - - # action advantage tower - A - with tf.variable_scope("action_advantage"): - action_advantage = tf.layers.dense(input_layer, 256, activation=tf.nn.relu, name='fc1') - action_advantage = tf.layers.dense(action_advantage, self.num_actions, name='fc2') - action_advantage = action_advantage - tf.reduce_mean(action_advantage) - - # merge to state-action value function Q - self.output = tf.add(state_value, action_advantage, name='output') - - -class VHead(Head): - def __init__(self, tuning_parameters, head_idx=0, loss_weight=1., is_local=True): - Head.__init__(self, tuning_parameters, head_idx, loss_weight, is_local) - self.name = 'v_values_head' - if tuning_parameters.agent.replace_mse_with_huber_loss: - self.loss_type = tf.losses.huber_loss - else: - self.loss_type = tf.losses.mean_squared_error - - def _build_module(self, input_layer): - # Standard V Network - self.output = tf.layers.dense(input_layer, 1, name='output', - kernel_initializer=normalized_columns_initializer(1.0)) - - -class PolicyHead(Head): - def __init__(self, tuning_parameters, head_idx=0, loss_weight=1., is_local=True): - Head.__init__(self, tuning_parameters, head_idx, loss_weight, is_local) - self.name = 'policy_values_head' - self.num_actions = tuning_parameters.env_instance.action_space_size - 
self.output_scale = np.max(tuning_parameters.env_instance.action_space_abs_range) - self.discrete_controls = tuning_parameters.env_instance.discrete_controls - self.exploration_policy = tuning_parameters.exploration.policy - self.exploration_variance = 2*self.output_scale*tuning_parameters.exploration.initial_noise_variance_percentage - if not self.discrete_controls and not self.output_scale: - raise ValueError("For continuous controls, an output scale for the network must be specified") - self.beta = tuning_parameters.agent.beta_entropy - - def _build_module(self, input_layer): - eps = 1e-15 - if self.discrete_controls: - self.actions = tf.placeholder(tf.int32, [None], name="actions") - else: - self.actions = tf.placeholder(tf.float32, [None, self.num_actions], name="actions") - self.input = [self.actions] - - # Policy Head - if self.discrete_controls: - policy_values = tf.layers.dense(input_layer, self.num_actions, name='fc') - self.policy_mean = tf.nn.softmax(policy_values, name="policy") - - # define the distributions for the policy and the old policy - # (the + eps is to prevent probability 0 which will cause the log later on to be -inf) - self.policy_distribution = tf.contrib.distributions.Categorical(probs=(self.policy_mean + eps)) - self.output = self.policy_mean - else: - # mean - policy_values_mean = tf.layers.dense(input_layer, self.num_actions, activation=tf.nn.tanh, name='fc_mean') - self.policy_mean = tf.multiply(policy_values_mean, self.output_scale, name='output_mean') - - self.output = [self.policy_mean] - - # std - if self.exploration_policy == 'ContinuousEntropy': - policy_values_std = tf.layers.dense(input_layer, self.num_actions, - kernel_initializer=normalized_columns_initializer(0.01), name='fc_std') - self.policy_std = tf.nn.softplus(policy_values_std, name='output_variance') + eps - - self.output.append(self.policy_std) - - else: - self.policy_std = tf.constant(self.exploration_variance, dtype='float32', shape=(self.num_actions,)) - - # define the distributions for the policy and the old policy - self.policy_distribution = tf.contrib.distributions.MultivariateNormalDiag(self.policy_mean, - self.policy_std) - - if self.is_local: - # add entropy regularization - if self.beta: - self.entropy = tf.reduce_mean(self.policy_distribution.entropy()) - self.regularizations = -tf.multiply(self.beta, self.entropy, name='entropy_regularization') - tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, self.regularizations) - - # calculate loss - self.action_log_probs_wrt_policy = self.policy_distribution.log_prob(self.actions) - self.advantages = tf.placeholder(tf.float32, [None], name="advantages") - self.target = self.advantages - self.loss = -tf.reduce_mean(self.action_log_probs_wrt_policy * self.advantages) - tf.losses.add_loss(self.loss_weight[0] * self.loss) - - -class MeasurementsPredictionHead(Head): - def __init__(self, tuning_parameters, head_idx=0, loss_weight=1., is_local=True): - Head.__init__(self, tuning_parameters, head_idx, loss_weight, is_local) - self.name = 'future_measurements_head' - self.num_actions = tuning_parameters.env_instance.action_space_size - self.num_measurements = tuning_parameters.env.measurements_size[0] \ - if tuning_parameters.env.measurements_size else 0 - self.num_prediction_steps = tuning_parameters.agent.num_predicted_steps_ahead - self.multi_step_measurements_size = self.num_measurements * self.num_prediction_steps - if tuning_parameters.agent.replace_mse_with_huber_loss: - self.loss_type = tf.losses.huber_loss - else: - 
self.loss_type = tf.losses.mean_squared_error - - def _build_module(self, input_layer): - # This is almost exactly the same as Dueling Network but we predict the future measurements for each action - # actions expectation tower (expectation stream) - E - with tf.variable_scope("expectation_stream"): - expectation_stream = tf.layers.dense(input_layer, 256, activation=tf.nn.elu, name='fc1') - expectation_stream = tf.layers.dense(expectation_stream, self.multi_step_measurements_size, name='output') - expectation_stream = tf.expand_dims(expectation_stream, axis=1) - - # action fine differences tower (action stream) - A - with tf.variable_scope("action_stream"): - action_stream = tf.layers.dense(input_layer, 256, activation=tf.nn.elu, name='fc1') - action_stream = tf.layers.dense(action_stream, self.num_actions * self.multi_step_measurements_size, - name='output') - action_stream = tf.reshape(action_stream, - (tf.shape(action_stream)[0], self.num_actions, self.multi_step_measurements_size)) - action_stream = action_stream - tf.reduce_mean(action_stream, reduction_indices=1, keep_dims=True) - - # merge to future measurements predictions - self.output = tf.add(expectation_stream, action_stream, name='output') - - -class DNDQHead(Head): - def __init__(self, tuning_parameters, head_idx=0, loss_weight=1., is_local=True): - Head.__init__(self, tuning_parameters, head_idx, loss_weight, is_local) - self.name = 'dnd_q_values_head' - self.num_actions = tuning_parameters.env_instance.action_space_size - self.DND_size = tuning_parameters.agent.dnd_size - self.DND_key_error_threshold = tuning_parameters.agent.DND_key_error_threshold - self.l2_norm_added_delta = tuning_parameters.agent.l2_norm_added_delta - self.new_value_shift_coefficient = tuning_parameters.agent.new_value_shift_coefficient - self.number_of_nn = tuning_parameters.agent.number_of_knn - if tuning_parameters.agent.replace_mse_with_huber_loss: - self.loss_type = tf.losses.huber_loss - else: - self.loss_type = tf.losses.mean_squared_error - self.tp = tuning_parameters - self.dnd_embeddings = [None]*self.num_actions - self.dnd_values = [None]*self.num_actions - self.dnd_indices = [None]*self.num_actions - - def _build_module(self, input_layer): - # DND based Q head - from memories import differentiable_neural_dictionary - - if self.tp.checkpoint_restore_dir: - self.DND = differentiable_neural_dictionary.load_dnd(self.tp.checkpoint_restore_dir) - else: - self.DND = differentiable_neural_dictionary.QDND( - self.DND_size, input_layer.get_shape()[-1], self.num_actions, self.new_value_shift_coefficient, - key_error_threshold=self.DND_key_error_threshold, learning_rate=self.tp.learning_rate) - - # Retrieve info from DND dictionary - # We assume that all actions have enough entries in the DND - self.output = tf.transpose([ - self._q_value(input_layer, action) - for action in range(self.num_actions) - ]) - - def _q_value(self, input_layer, action): - result = tf.py_func(self.DND.query, - [input_layer, action, self.number_of_nn], - [tf.float64, tf.float64, tf.int64]) - self.dnd_embeddings[action] = tf.to_float(result[0]) - self.dnd_values[action] = tf.to_float(result[1]) - self.dnd_indices[action] = result[2] - - # DND calculation - square_diff = tf.square(self.dnd_embeddings[action] - tf.expand_dims(input_layer, 1)) - distances = tf.reduce_sum(square_diff, axis=2) + [self.l2_norm_added_delta] - weights = 1.0 / distances - normalised_weights = weights / tf.reduce_sum(weights, axis=1, keep_dims=True) - return tf.reduce_sum(self.dnd_values[action] * 
normalised_weights, axis=1) - - -class NAFHead(Head): - def __init__(self, tuning_parameters, head_idx=0, loss_weight=1., is_local=True): - Head.__init__(self, tuning_parameters, head_idx, loss_weight, is_local) - self.name = 'naf_q_values_head' - self.num_actions = tuning_parameters.env_instance.action_space_size - self.output_scale = np.max(tuning_parameters.env_instance.action_space_abs_range) - if tuning_parameters.agent.replace_mse_with_huber_loss: - self.loss_type = tf.losses.huber_loss - else: - self.loss_type = tf.losses.mean_squared_error - - def _build_module(self, input_layer): - # NAF - self.action = tf.placeholder(tf.float32, [None, self.num_actions], name="action") - self.input = self.action - - # V Head - self.V = tf.layers.dense(input_layer, 1, name='V') - - # mu Head - mu_unscaled = tf.layers.dense(input_layer, self.num_actions, activation=tf.nn.tanh, name='mu_unscaled') - self.mu = tf.multiply(mu_unscaled, self.output_scale, name='mu') - - # A Head - # l_vector is a vector that includes a lower-triangular matrix values - self.l_vector = tf.layers.dense(input_layer, (self.num_actions * (self.num_actions + 1)) / 2, name='l_vector') - - # Convert l to a lower triangular matrix and exponentiate its diagonal - - i = 0 - columns = [] - for col in range(self.num_actions): - start_row = col - num_non_zero_elements = self.num_actions - start_row - zeros_column_part = tf.zeros_like(self.l_vector[:, 0:start_row]) - diag_element = tf.expand_dims(tf.exp(self.l_vector[:, i]), 1) - non_zeros_non_diag_column_part = self.l_vector[:, (i + 1):(i + num_non_zero_elements)] - columns.append(tf.concat([zeros_column_part, diag_element, non_zeros_non_diag_column_part], axis=1)) - i += num_non_zero_elements - self.L = tf.transpose(tf.stack(columns, axis=1), (0, 2, 1)) - - # P = L*L^T - self.P = tf.matmul(self.L, tf.transpose(self.L, (0, 2, 1))) - - # A = -1/2 * (u - mu)^T * P * (u - mu) - action_diff = tf.expand_dims(self.action - self.mu, -1) - a_matrix_form = -0.5 * tf.matmul(tf.transpose(action_diff, (0, 2, 1)), tf.matmul(self.P, action_diff)) - self.A = tf.reshape(a_matrix_form, [-1, 1]) - - # Q Head - self.Q = tf.add(self.V, self.A, name='Q') - - self.output = self.Q - - -class PPOHead(Head): - def __init__(self, tuning_parameters, head_idx=0, loss_weight=1., is_local=True): - Head.__init__(self, tuning_parameters, head_idx, loss_weight, is_local) - self.name = 'ppo_head' - self.num_actions = tuning_parameters.env_instance.action_space_size - self.discrete_controls = tuning_parameters.env_instance.discrete_controls - self.output_scale = np.max(tuning_parameters.env_instance.action_space_abs_range) - - # kl coefficient and its corresponding assignment operation and placeholder - self.kl_coefficient = tf.Variable(tuning_parameters.agent.initial_kl_coefficient, - trainable=False, name='kl_coefficient') - self.kl_coefficient_ph = tf.placeholder('float', name='kl_coefficient_ph') - self.assign_kl_coefficient = tf.assign(self.kl_coefficient, self.kl_coefficient_ph) - - self.kl_cutoff = 2*tuning_parameters.agent.target_kl_divergence - self.high_kl_penalty_coefficient = tuning_parameters.agent.high_kl_penalty_coefficient - self.clip_likelihood_ratio_using_epsilon = tuning_parameters.agent.clip_likelihood_ratio_using_epsilon - self.use_kl_regularization = tuning_parameters.agent.use_kl_regularization - self.beta = tuning_parameters.agent.beta_entropy - - def _build_module(self, input_layer): - eps = 1e-15 - if self.discrete_controls: - self.actions = tf.placeholder(tf.int32, [None], name="actions") - 
else: - self.actions = tf.placeholder(tf.float32, [None, self.num_actions], name="actions") - self.old_policy_mean = tf.placeholder(tf.float32, [None, self.num_actions], "old_policy_mean") - self.old_policy_std = tf.placeholder(tf.float32, [None, self.num_actions], "old_policy_std") - - # Policy Head - if self.discrete_controls: - self.input = [self.actions, self.old_policy_mean] - policy_values = tf.layers.dense(input_layer, self.num_actions, name='policy_fc') - self.policy_mean = tf.nn.softmax(policy_values, name="policy") - - # define the distributions for the policy and the old policy - self.policy_distribution = tf.contrib.distributions.Categorical(probs=(self.policy_mean + eps)) - self.old_policy_distribution = tf.contrib.distributions.Categorical(probs=self.old_policy_mean) - - self.output = self.policy_mean - else: - self.input = [self.actions, self.old_policy_mean, self.old_policy_std] - self.policy_mean = tf.layers.dense(input_layer, self.num_actions, name='policy_mean') - self.policy_logstd = tf.Variable(np.zeros((1, self.num_actions)), dtype='float32') - self.policy_std = tf.tile(tf.exp(self.policy_logstd), [tf.shape(input_layer)[0], 1], name='policy_std') - - # define the distributions for the policy and the old policy - self.policy_distribution = tf.contrib.distributions.MultivariateNormalDiag(self.policy_mean, - self.policy_std) - self.old_policy_distribution = tf.contrib.distributions.MultivariateNormalDiag(self.old_policy_mean, - self.old_policy_std) - - self.output = [self.policy_mean, self.policy_std] - - self.action_probs_wrt_policy = tf.exp(self.policy_distribution.log_prob(self.actions)) - self.action_probs_wrt_old_policy = tf.exp(self.old_policy_distribution.log_prob(self.actions)) - self.entropy = tf.reduce_mean(self.policy_distribution.entropy()) - - # add kl divergence regularization - self.kl_divergence = tf.reduce_mean(tf.contrib.distributions.kl_divergence(self.old_policy_distribution, - self.policy_distribution)) - if self.use_kl_regularization: - # no clipping => use kl regularization - self.weighted_kl_divergence = tf.multiply(self.kl_coefficient, self.kl_divergence) - self.regularizations = self.weighted_kl_divergence + self.high_kl_penalty_coefficient * \ - tf.square(tf.maximum(0.0, self.kl_divergence - self.kl_cutoff)) - tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, self.regularizations) - - # calculate surrogate loss - self.advantages = tf.placeholder(tf.float32, [None], name="advantages") - self.target = self.advantages - self.likelihood_ratio = self.action_probs_wrt_policy / (self.action_probs_wrt_old_policy + eps) - if self.clip_likelihood_ratio_using_epsilon is not None: - max_value = 1 + self.clip_likelihood_ratio_using_epsilon - min_value = 1 - self.clip_likelihood_ratio_using_epsilon - self.clipped_likelihood_ratio = tf.clip_by_value(self.likelihood_ratio, min_value, max_value) - self.scaled_advantages = tf.minimum(self.likelihood_ratio * self.advantages, - self.clipped_likelihood_ratio * self.advantages) - else: - self.scaled_advantages = self.likelihood_ratio * self.advantages - # minus sign is in order to set an objective to minimize (we actually strive for maximizing the surrogate loss) - self.surrogate_loss = -tf.reduce_mean(self.scaled_advantages) - if self.is_local: - # add entropy regularization - if self.beta: - self.entropy = tf.reduce_mean(self.policy_distribution.entropy()) - self.regularizations = -tf.multiply(self.beta, self.entropy, name='entropy_regularization') - tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, 
self.regularizations) - - self.loss = self.surrogate_loss - tf.losses.add_loss(self.loss) - - -class PPOVHead(Head): - def __init__(self, tuning_parameters, head_idx=0, loss_weight=1., is_local=True): - Head.__init__(self, tuning_parameters, head_idx, loss_weight, is_local) - self.name = 'ppo_v_head' - self.clip_likelihood_ratio_using_epsilon = tuning_parameters.agent.clip_likelihood_ratio_using_epsilon - - def _build_module(self, input_layer): - self.old_policy_value = tf.placeholder(tf.float32, [None], "old_policy_values") - self.input = [self.old_policy_value] - self.output = tf.layers.dense(input_layer, 1, name='output', - kernel_initializer=normalized_columns_initializer(1.0)) - self.target = self.total_return = tf.placeholder(tf.float32, [None], name="total_return") - - value_loss_1 = tf.square(self.output - self.target) - value_loss_2 = tf.square(self.old_policy_value + - tf.clip_by_value(self.output - self.old_policy_value, - -self.clip_likelihood_ratio_using_epsilon, - self.clip_likelihood_ratio_using_epsilon) - self.target) - self.vf_loss = tf.reduce_mean(tf.maximum(value_loss_1, value_loss_2)) - self.loss = self.vf_loss - tf.losses.add_loss(self.loss) - - -class CategoricalQHead(Head): - def __init__(self, tuning_parameters, head_idx=0, loss_weight=1., is_local=True): - Head.__init__(self, tuning_parameters, head_idx, loss_weight, is_local) - self.name = 'categorical_dqn_head' - self.num_actions = tuning_parameters.env_instance.action_space_size - self.num_atoms = tuning_parameters.agent.atoms - - def _build_module(self, input_layer): - self.actions = tf.placeholder(tf.int32, [None], name="actions") - self.input = [self.actions] - - values_distribution = tf.layers.dense(input_layer, self.num_actions * self.num_atoms, name='output') - values_distribution = tf.reshape(values_distribution, (tf.shape(values_distribution)[0], self.num_actions, self.num_atoms)) - # softmax on atoms dimension - self.output = tf.nn.softmax(values_distribution) - - # calculate cross entropy loss - self.distributions = tf.placeholder(tf.float32, shape=(None, self.num_actions, self.num_atoms), name="distributions") - self.target = self.distributions - self.loss = tf.nn.softmax_cross_entropy_with_logits(labels=self.target, logits=values_distribution) - tf.losses.add_loss(self.loss) - - -class QuantileRegressionQHead(Head): - def __init__(self, tuning_parameters, head_idx=0, loss_weight=1., is_local=True): - Head.__init__(self, tuning_parameters, head_idx, loss_weight, is_local) - self.name = 'quantile_regression_dqn_head' - self.num_actions = tuning_parameters.env_instance.action_space_size - self.num_atoms = tuning_parameters.agent.atoms # we use atom / quantile interchangeably - self.huber_loss_interval = 1 # k - - def _build_module(self, input_layer): - self.actions = tf.placeholder(tf.int32, [None, 2], name="actions") - self.quantile_midpoints = tf.placeholder(tf.float32, [None, self.num_atoms], name="quantile_midpoints") - self.input = [self.actions, self.quantile_midpoints] - - # the output of the head is the N unordered quantile locations {theta_1, ..., theta_N} - quantiles_locations = tf.layers.dense(input_layer, self.num_actions * self.num_atoms, name='output') - quantiles_locations = tf.reshape(quantiles_locations, (tf.shape(quantiles_locations)[0], self.num_actions, self.num_atoms)) - self.output = quantiles_locations - - self.quantiles = tf.placeholder(tf.float32, shape=(None, self.num_atoms), name="quantiles") - self.target = self.quantiles - - # only the quantiles of the taken action are taken 
into account - quantiles_for_used_actions = tf.gather_nd(quantiles_locations, self.actions) - - # reorder the output quantiles and the target quantiles as a preparation step for calculating the loss - # the output quantiles vector and the quantile midpoints are tiled as rows of a NxN matrix (N = num quantiles) - # the target quantiles vector is tiled as column of a NxN matrix - theta_i = tf.tile(tf.expand_dims(quantiles_for_used_actions, -1), [1, 1, self.num_atoms]) - T_theta_j = tf.tile(tf.expand_dims(self.target, -2), [1, self.num_atoms, 1]) - tau_i = tf.tile(tf.expand_dims(self.quantile_midpoints, -1), [1, 1, self.num_atoms]) - - # Huber loss of T(theta_j) - theta_i - error = T_theta_j - theta_i - abs_error = tf.abs(error) - quadratic = tf.minimum(abs_error, self.huber_loss_interval) - huber_loss = self.huber_loss_interval * (abs_error - quadratic) + 0.5 * quadratic ** 2 - - # Quantile Huber loss - quantile_huber_loss = tf.abs(tau_i - tf.cast(error < 0, dtype=tf.float32)) * huber_loss - - # Quantile regression loss (the probability for each quantile is 1/num_quantiles) - quantile_regression_loss = tf.reduce_sum(quantile_huber_loss) / float(self.num_atoms) - self.loss = quantile_regression_loss - tf.losses.add_loss(self.loss) diff --git a/architectures/tensorflow_components/middleware.py b/architectures/tensorflow_components/middleware.py deleted file mode 100644 index eee5925..0000000 --- a/architectures/tensorflow_components/middleware.py +++ /dev/null @@ -1,77 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -import tensorflow as tf -import numpy as np -from configurations import EmbedderWidth - - -class MiddlewareEmbedder(object): - def __init__(self, activation_function=tf.nn.relu, embedder_width=EmbedderWidth.Wide, name="middleware_embedder"): - self.name = name - self.input = None - self.output = None - self.embedder_width = embedder_width - self.activation_function = activation_function - - def __call__(self, input_layer): - with tf.variable_scope(self.get_name()): - self.input = input_layer - self._build_module() - - return self.input, self.output - - def _build_module(self): - pass - - def get_name(self): - return self.name - - -class LSTM_Embedder(MiddlewareEmbedder): - def _build_module(self): - """ - self.state_in: tuple of placeholders containing the initial state - self.state_out: tuple of output state - - todo: it appears that the shape of the output is batch, feature - the code here seems to be slicing off the first element in the batch - which would definitely be wrong. 
need to double check the shape - """ - - middleware = tf.layers.dense(self.input, 512, activation=self.activation_function, name='fc1') - lstm_cell = tf.contrib.rnn.BasicLSTMCell(256, state_is_tuple=True) - self.c_init = np.zeros((1, lstm_cell.state_size.c), np.float32) - self.h_init = np.zeros((1, lstm_cell.state_size.h), np.float32) - self.state_init = [self.c_init, self.h_init] - self.c_in = tf.placeholder(tf.float32, [1, lstm_cell.state_size.c]) - self.h_in = tf.placeholder(tf.float32, [1, lstm_cell.state_size.h]) - self.state_in = (self.c_in, self.h_in) - rnn_in = tf.expand_dims(middleware, [0]) - step_size = tf.shape(middleware)[:1] - state_in = tf.contrib.rnn.LSTMStateTuple(self.c_in, self.h_in) - lstm_outputs, lstm_state = tf.nn.dynamic_rnn( - lstm_cell, rnn_in, initial_state=state_in, sequence_length=step_size, time_major=False) - lstm_c, lstm_h = lstm_state - self.state_out = (lstm_c[:1, :], lstm_h[:1, :]) - self.output = tf.reshape(lstm_outputs, [-1, 256]) - - -class FC_Embedder(MiddlewareEmbedder): - def _build_module(self): - width = 512 if self.embedder_width == EmbedderWidth.Wide else 64 - self.output = tf.layers.dense(self.input, width, activation=self.activation_function, name='fc1') - diff --git a/architectures/tensorflow_components/shared_variables.py b/architectures/tensorflow_components/shared_variables.py deleted file mode 100644 index 2775251..0000000 --- a/architectures/tensorflow_components/shared_variables.py +++ /dev/null @@ -1,82 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# - -import tensorflow as tf -import numpy as np - - -class SharedRunningStats(object): - def __init__(self, tuning_parameters, replicated_device, epsilon=1e-2, shape=(), name=""): - self.tp = tuning_parameters - with tf.device(replicated_device): - with tf.variable_scope(name): - self._sum = tf.get_variable( - dtype=tf.float64, - shape=shape, - initializer=tf.constant_initializer(0.0), - name="running_sum", trainable=False) - self._sum_squared = tf.get_variable( - dtype=tf.float64, - shape=shape, - initializer=tf.constant_initializer(epsilon), - name="running_sum_squared", trainable=False) - self._count = tf.get_variable( - dtype=tf.float64, - shape=(), - initializer=tf.constant_initializer(epsilon), - name="count", trainable=False) - - self._shape = shape - self._mean = self._sum / self._count - self._std = tf.sqrt(tf.maximum((self._sum_squared - self._count*tf.square(self._mean)) - / tf.maximum(self._count-1, 1), epsilon)) - - self.new_sum = tf.placeholder(shape=self.shape, dtype=tf.float64, name='sum') - self.new_sum_squared = tf.placeholder(shape=self.shape, dtype=tf.float64, name='var') - self.newcount = tf.placeholder(shape=[], dtype=tf.float64, name='count') - - self._inc_sum = tf.assign_add(self._sum, self.new_sum, use_locking=True) - self._inc_sum_squared = tf.assign_add(self._sum_squared, self.new_sum_squared, use_locking=True) - self._inc_count = tf.assign_add(self._count, self.newcount, use_locking=True) - - def push(self, x): - x = x.astype('float64') - self.tp.sess.run([self._inc_sum, self._inc_sum_squared, self._inc_count], - feed_dict={ - self.new_sum: x.sum(axis=0).ravel(), - self.new_sum_squared: np.square(x).sum(axis=0).ravel(), - self.newcount: np.array(len(x), dtype='float64') - }) - - @property - def n(self): - return self.tp.sess.run(self._count) - - @property - def mean(self): - return self.tp.sess.run(self._mean) - - @property - def var(self): - return self.std ** 2 - - @property - def std(self): - return self.tp.sess.run(self._std) - - @property - def shape(self): - return self._shape \ No newline at end of file diff --git a/benchmarks/README.md b/benchmarks/README.md index ba237e7..0c00442 100644 --- a/benchmarks/README.md +++ b/benchmarks/README.md @@ -1,172 +1,44 @@ # Coach Benchmarks -The following figures are training curves of some of the presets available through Coach. -The X axis in all the figures is the total steps (for multi-threaded runs, this is the accumulated number of steps over all the workers). -The Y axis in all the figures is the average episode reward with an averaging window of 11 episodes. +The following table represents the current status of algorithms implemented in Coach relative to the results reported in the original papers. The detailed results for each algorithm can be seen by clicking on its name. + +The X axis in all the figures is the total steps (for multi-threaded runs, this is the number of steps per worker). +The Y axis in all the figures is the average episode reward with an averaging window of 100 timesteps. + +For each algorithm, there is a command line for reproducing the results of each graph. These are the results you can expect to get when running the pre-defined presets in Coach. +The environments that were used for testing include: +* **Atari** - Breakout, Pong and Space Invaders +* **Mujoco** - Inverted Pendulum, Inverted Double Pendulum, Reacher, Hopper, Half Cheetah, Walker 2D, Ant, Swimmer and Humanoid. 
+* **Doom** - Basic, Health Gathering (D1: Basic), Health Gathering Supreme (D2: Navigation), Battle (D3: Battle) +* **Fetch** - Reach, Slide, Push, Pick-and-Place -## A3C +## Summary -### Breakout_A3C with 16 workers +![#2E8B57](https://placehold.it/15/2E8B57/000000?text=+) *Reproducing paper's results* -```bash -python3 coach.py -p Breakout_A3C -n 16 -r -``` +![#ceffad](https://placehold.it/15/ceffad/000000?text=+) *Reproducing paper's results for some of the environments* -Breakout_A3C_16_workers +![#FFA500](https://placehold.it/15/FFA500/000000?text=+) *Training but not reproducing paper's results* -### InvertedPendulum_A3C with 16 workers +![#FF4040](https://placehold.it/15/FF4040/000000?text=+) *Not training* -```bash -python3 coach.py -p InvertedPendulum_A3C -n 16 -r -``` -Inverted_Pendulum_A3C_16_workers +| |**Status** |**Environments**|**Comments**| +| ----------------------- |:--------------------------------------------------------:|:--------------:|:--------:| +|**[DQN](dqn)** | ![#ceffad](https://placehold.it/15/ceffad/000000?text=+) |Atari | Pong is not training | +|**[Dueling DDQN](dueling_ddqn)**| ![#ceffad](https://placehold.it/15/ceffad/000000?text=+) |Atari | Pong is not training | +|**[Dueling DDQN with PER](dueling_ddqn_with_per)**| ![#2E8B57](https://placehold.it/15/2E8B57/000000?text=+) |Atari | | +|**[Bootstrapped DQN](bootstrapped_dqn)**| ![#2E8B57](https://placehold.it/15/2E8B57/000000?text=+) |Atari | | +|**[QR-DQN](qr_dqn)** | ![#2E8B57](https://placehold.it/15/2E8B57/000000?text=+) |Atari | | +|**[A3C](a3c)** | ![#2E8B57](https://placehold.it/15/2E8B57/000000?text=+) |Atari, Mujoco | | +|**[Clipped PPO](clipped_ppo)** | ![#2E8B57](https://placehold.it/15/2E8B57/000000?text=+) |Mujoco | | +|**[DDPG](ddpg)** | ![#2E8B57](https://placehold.it/15/2E8B57/000000?text=+) |Mujoco | | +|**[NEC](nec)** | ![#2E8B57](https://placehold.it/15/2E8B57/000000?text=+) |Atari | | +|**[HER](ddpg_her)** | ![#2E8B57](https://placehold.it/15/2E8B57/000000?text=+) |Fetch | | +|**[HAC](hac)** | ![#969696](https://placehold.it/15/969696/000000?text=+) |Pendulum | | +|**[DFP](dfp)** | ![#ceffad](https://placehold.it/15/ceffad/000000?text=+) |Doom | Doom Battle was not verified | -### Hopper_A3C with 16 workers -```bash -python3 coach.py -p Hopper_A3C -n 16 -r -``` - -Hopper_A3C_16_workers - -### Ant_A3C with 16 workers - -```bash -python3 coach.py -p Ant_A3C -n 16 -r -``` - -Ant_A3C_16_workers - -## Clipped PPO - -### InvertedPendulum_ClippedPPO with 16 workers - -```bash -python3 coach.py -p InvertedPendulum_ClippedPPO -n 16 -r -``` - -InvertedPendulum_ClippedPPO_16_workers - -### Hopper_ClippedPPO with 16 workers - -```bash -python3 coach.py -p Hopper_ClippedPPO -n 16 -r -``` - -Hopper_Clipped_PPO_16_workers - -### Humanoid_ClippedPPO with 16 workers - -```bash -python3 coach.py -p Humanoid_ClippedPPO -n 16 -r -``` - -Humanoid_ClippedPPO_16_workers - -## DQN - -### Pong_DQN - -```bash -python3 coach.py -p Pong_DQN -r -``` - -Pong_DQN - -### Doom_Basic_DQN - -```bash -python3 coach.py -p Doom_Basic_DQN -r -``` - -Doom_Basic_DQN - -## Dueling DDQN - -### Doom_Basic_Dueling_DDQN - -```bash -python3 coach.py -p Doom_Basic_Dueling_DDQN -r -``` - -Doom_Basic_Dueling_DDQN - -## DFP - -### Doom_Health_DFP - -```bash -python3 coach.py -p Doom_Health_DFP -r -``` - -Doom_Health_DFP - -## MMC - -### Doom_Health_MMC - -```bash -python3 coach.py -p Doom_Health_MMC -r -``` - -Doom_Health_MMC - -## NEC - -## Pong_NEC - -```bash -python3 coach.py -p Pong_NEC -r -``` - -Pong_NEC - -## Doom_Basic_NEC - 
-```bash -python3 coach.py -p Doom_Basic_NEC -r -``` - -Doom_Basic_NEC - -## PG - -### CartPole_PG - -```bash -python3 coach.py -p CartPole_PG -r -``` - -CartPole_PG - -## DDPG - -### Pendulum_DDPG - -```bash -python3 coach.py -p Pendulum_DDPG -r -``` - -Pendulum_DDPG - - -## NAF - -### InvertedPendulum_NAF - -```bash -python3 coach.py -p InvertedPendulum_NAF -r -``` - -InvertedPendulum_NAF - -### Pendulum_NAF - -```bash -python3 coach.py -p Pendulum_NAF -r -``` - -Pendulum_NAF +**Click on each algorithm to see detailed benchmarking results** diff --git a/benchmarks/a3c/README.md b/benchmarks/a3c/README.md new file mode 100644 index 0000000..8fde621 --- /dev/null +++ b/benchmarks/a3c/README.md @@ -0,0 +1,43 @@ +# A3C + +Each experiment uses 3 seeds. +The parameters used for A3C are the same parameters as described in the [original paper](https://arxiv.org/abs/1602.01783). + +### Inverted Pendulum A3C - 1/2/4/8/16 workers + +```bash +python3 coach.py -p Mujoco_A3C -lvl inverted_pendulum -n 1 +python3 coach.py -p Mujoco_A3C -lvl inverted_pendulum -n 2 +python3 coach.py -p Mujoco_A3C -lvl inverted_pendulum -n 4 +python3 coach.py -p Mujoco_A3C -lvl inverted_pendulum -n 8 +python3 coach.py -p Mujoco_A3C -lvl inverted_pendulum -n 16 +``` + +Inverted Pendulum A3C + + +### Hopper A3C - 16 workers + +```bash +python3 coach.py -p Mujoco_A3C -lvl hopper -n 16 +``` + +Hopper A3C 16 workers + + +### Walker2D A3C - 16 workers + +```bash +python3 coach.py -p Mujoco_A3C -lvl walker2d -n 16 +``` + +Walker2D A3C 16 workers + + +### Space Invaders A3C - 16 workers + +```bash +python3 coach.py -p Atari_A3C -lvl space_invaders -n 16 +``` + +Space Invaders A3C 16 workers diff --git a/benchmarks/a3c/hopper_a3c_16_workers.png b/benchmarks/a3c/hopper_a3c_16_workers.png new file mode 100644 index 0000000..4607f6a Binary files /dev/null and b/benchmarks/a3c/hopper_a3c_16_workers.png differ diff --git a/benchmarks/a3c/inverted_pendulum_a3c.png b/benchmarks/a3c/inverted_pendulum_a3c.png new file mode 100644 index 0000000..65b1720 Binary files /dev/null and b/benchmarks/a3c/inverted_pendulum_a3c.png differ diff --git a/benchmarks/a3c/space_invaders_a3c_16_workers.png b/benchmarks/a3c/space_invaders_a3c_16_workers.png new file mode 100644 index 0000000..9208f89 Binary files /dev/null and b/benchmarks/a3c/space_invaders_a3c_16_workers.png differ diff --git a/benchmarks/a3c/walker2d_a3c_16_workers.png b/benchmarks/a3c/walker2d_a3c_16_workers.png new file mode 100644 index 0000000..a003359 Binary files /dev/null and b/benchmarks/a3c/walker2d_a3c_16_workers.png differ diff --git a/benchmarks/bootstrapped_dqn/README.md b/benchmarks/bootstrapped_dqn/README.md new file mode 100644 index 0000000..8a5f059 --- /dev/null +++ b/benchmarks/bootstrapped_dqn/README.md @@ -0,0 +1,31 @@ +# Bootstrapped DQN + +Each experiment uses 3 seeds. +The parameters used for Bootstrapped DQN are the same parameters as described in the [original paper](https://arxiv.org/abs/1602.04621).
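As a reminder of what these presets train: Bootstrapped DQN keeps K Q-value heads on a shared torso, samples one head at the start of every episode and acts greedily with respect to it, while each head is trained only on its own bootstrapped subset of the replay data. The sketch below illustrates just the per-episode head-selection logic; the `q_values_per_head` function and the constants are illustrative placeholders, not Coach's implementation.

```python
# Minimal sketch of Bootstrapped DQN's per-episode head selection (illustrative only).
import numpy as np

K = 10            # number of bootstrap heads (a typical value)
NUM_ACTIONS = 4

def q_values_per_head(state):
    # Placeholder for a forward pass through a K-headed Q-network (not Coach's API).
    rng = np.random.default_rng(abs(hash(state)) % (2 ** 32))
    return rng.standard_normal((K, NUM_ACTIONS))

def run_episode(states):
    head = np.random.randint(K)                 # sample one head for the whole episode
    actions = []
    for state in states:
        q = q_values_per_head(state)[head]      # act greedily w.r.t. the sampled head only
        actions.append(int(np.argmax(q)))
        # in the real algorithm, each stored transition also carries a per-head bootstrap
        # mask so that every head is trained on its own subset of the replay buffer
    return actions

print(run_episode(["s0", "s1", "s2"]))
```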
+ +### Breakout Bootstrapped DQN - single worker + +```bash +python3 coach.py -p Atari_Bootstrapped_DQN -lvl breakout +``` + +Breakout Bootstrapped DQN + + +### Pong Bootstrapped DQN - single worker + +```bash +python3 coach.py -p Atari_Bootstrapped_DQN -lvl pong +``` + +Pong Bootstrapped DQN + + +### Space Invaders Bootstrapped DQN - single worker + +```bash +python3 coach.py -p Atari_Bootstrapped_DQN -lvl space_invaders +``` + +Space Invaders Bootstrapped DQN + diff --git a/benchmarks/bootstrapped_dqn/breakout_bootstrapped_dqn.png b/benchmarks/bootstrapped_dqn/breakout_bootstrapped_dqn.png new file mode 100644 index 0000000..b38b6fc Binary files /dev/null and b/benchmarks/bootstrapped_dqn/breakout_bootstrapped_dqn.png differ diff --git a/benchmarks/bootstrapped_dqn/pong_bootstrapped_dqn.png b/benchmarks/bootstrapped_dqn/pong_bootstrapped_dqn.png new file mode 100644 index 0000000..af7ca76 Binary files /dev/null and b/benchmarks/bootstrapped_dqn/pong_bootstrapped_dqn.png differ diff --git a/benchmarks/bootstrapped_dqn/space_invaders_bootstrapped_dqn.png b/benchmarks/bootstrapped_dqn/space_invaders_bootstrapped_dqn.png new file mode 100644 index 0000000..1494f40 Binary files /dev/null and b/benchmarks/bootstrapped_dqn/space_invaders_bootstrapped_dqn.png differ diff --git a/benchmarks/clipped_ppo/README.md b/benchmarks/clipped_ppo/README.md new file mode 100644 index 0000000..00f2766 --- /dev/null +++ b/benchmarks/clipped_ppo/README.md @@ -0,0 +1,84 @@ +# Clipped PPO + +Each experiment uses 3 seeds and is trained for 10k environment steps. +The parameters used for Clipped PPO are the same parameters as described in the [original paper](https://arxiv.org/abs/1707.06347). + +### Inverted Pendulum Clipped PPO - single worker + +```bash +python3 coach.py -p Mujoco_ClippedPPO -lvl inverted_pendulum +``` + +Inverted Pendulum Clipped PPO + + +### Inverted Double Pendulum Clipped PPO - single worker + +```bash +python3 coach.py -p Mujoco_ClippedPPO -lvl inverted_double_pendulum +``` + +Inverted Double Pendulum Clipped PPO + + +### Reacher Clipped PPO - single worker + +```bash +python3 coach.py -p Mujoco_ClippedPPO -lvl reacher +``` + +Reacher Clipped PPO + + +### Hopper Clipped PPO - single worker + +```bash +python3 coach.py -p Mujoco_ClippedPPO -lvl hopper +``` + +Hopper Clipped PPO + + +### Half Cheetah Clipped PPO - single worker + +```bash +python3 coach.py -p Mujoco_ClippedPPO -lvl half_cheetah +``` + +Half Cheetah Clipped PPO + + +### Walker 2D Clipped PPO - single worker + +```bash +python3 coach.py -p Mujoco_ClippedPPO -lvl walker2d +``` + +Walker 2D Clipped PPO + + +### Ant Clipped PPO - single worker + +```bash +python3 coach.py -p Mujoco_ClippedPPO -lvl ant +``` + +Ant Clipped PPO + + +### Swimmer Clipped PPO - single worker + +```bash +python3 coach.py -p Mujoco_ClippedPPO -lvl swimmer +``` + +Swimmer Clipped PPO + + +### Humanoid Clipped PPO - single worker + +```bash +python3 coach.py -p Mujoco_ClippedPPO -lvl humanoid +``` + +Humanoid Clipped PPO diff --git a/benchmarks/clipped_ppo/ant_clipped_ppo.png b/benchmarks/clipped_ppo/ant_clipped_ppo.png new file mode 100644 index 0000000..d500180 Binary files /dev/null and b/benchmarks/clipped_ppo/ant_clipped_ppo.png differ diff --git a/benchmarks/clipped_ppo/half_cheetah_clipped_ppo.png b/benchmarks/clipped_ppo/half_cheetah_clipped_ppo.png new file mode 100644 index 0000000..fc4c5b9 Binary files /dev/null and b/benchmarks/clipped_ppo/half_cheetah_clipped_ppo.png differ diff --git a/benchmarks/clipped_ppo/hopper_clipped_ppo.png 
b/benchmarks/clipped_ppo/hopper_clipped_ppo.png new file mode 100644 index 0000000..79cc2bf Binary files /dev/null and b/benchmarks/clipped_ppo/hopper_clipped_ppo.png differ diff --git a/benchmarks/clipped_ppo/humanoid_clipped_ppo.png b/benchmarks/clipped_ppo/humanoid_clipped_ppo.png new file mode 100644 index 0000000..1612430 Binary files /dev/null and b/benchmarks/clipped_ppo/humanoid_clipped_ppo.png differ diff --git a/benchmarks/clipped_ppo/inverted_double_pendulum_clipped_ppo.png b/benchmarks/clipped_ppo/inverted_double_pendulum_clipped_ppo.png new file mode 100644 index 0000000..6473460 Binary files /dev/null and b/benchmarks/clipped_ppo/inverted_double_pendulum_clipped_ppo.png differ diff --git a/benchmarks/clipped_ppo/inverted_pendulum_clipped_ppo.png b/benchmarks/clipped_ppo/inverted_pendulum_clipped_ppo.png new file mode 100644 index 0000000..0302d17 Binary files /dev/null and b/benchmarks/clipped_ppo/inverted_pendulum_clipped_ppo.png differ diff --git a/benchmarks/clipped_ppo/reacher_clipped_ppo.png b/benchmarks/clipped_ppo/reacher_clipped_ppo.png new file mode 100644 index 0000000..d58e3e6 Binary files /dev/null and b/benchmarks/clipped_ppo/reacher_clipped_ppo.png differ diff --git a/benchmarks/clipped_ppo/swimmer_clipped_ppo.png b/benchmarks/clipped_ppo/swimmer_clipped_ppo.png new file mode 100644 index 0000000..7fd0e8f Binary files /dev/null and b/benchmarks/clipped_ppo/swimmer_clipped_ppo.png differ diff --git a/benchmarks/clipped_ppo/walker2d_clipped_ppo.png b/benchmarks/clipped_ppo/walker2d_clipped_ppo.png new file mode 100644 index 0000000..3150b70 Binary files /dev/null and b/benchmarks/clipped_ppo/walker2d_clipped_ppo.png differ diff --git a/benchmarks/ddpg/README.md b/benchmarks/ddpg/README.md new file mode 100644 index 0000000..f10fa0e --- /dev/null +++ b/benchmarks/ddpg/README.md @@ -0,0 +1,84 @@ +# DDPG + +Each experiment uses 3 seeds and is trained for 2k environment steps. +The parameters used for DDPG are the same parameters as described in the [original paper](https://arxiv.org/abs/1509.02971). 
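For context on what this preset optimizes: DDPG trains a deterministic actor and a Q-value critic, and stabilizes learning with target copies of both networks that follow the online weights through a slow soft update, while the critic regresses onto a bootstrapped target. Below is a minimal NumPy sketch of those two update rules under assumed names (`tau`, `gamma`, plain weight lists); it illustrates the rules from the paper, not Coach's internals.

```python
# Illustrative NumPy sketch of DDPG's two core update rules (not Coach's implementation).
import numpy as np

tau, gamma = 0.001, 0.99   # soft-update rate and discount factor, as in the DDPG paper

def soft_update(target_weights, online_weights):
    # theta_target <- tau * theta_online + (1 - tau) * theta_target, applied per weight tensor
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online_weights, target_weights)]

def critic_target(reward, done, q_next_from_targets):
    # y = r + gamma * Q_target(s', mu_target(s')) for non-terminal transitions
    return reward + gamma * (1.0 - done) * q_next_from_targets

online = [np.ones((4, 4)), np.ones(4)]
target = [np.zeros((4, 4)), np.zeros(4)]
target = soft_update(target, online)                                # target drifts slowly toward online
y = critic_target(reward=1.0, done=0.0, q_next_from_targets=0.5)    # -> 1.495
```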
+ +### Inverted Pendulum DDPG - single worker + +```bash +python3 coach.py -p Mujoco_DDPG -lvl inverted_pendulum +``` + +Inverted Pendulum DDPG + + +### Inverted Double Pendulum DDPG - single worker + +```bash +python3 coach.py -p Mujoco_DDPG -lvl inverted_double_pendulum +``` + +Inverted Double Pendulum DDPG + + +### Reacher DDPG - single worker + +```bash +python3 coach.py -p Mujoco_DDPG -lvl reacher +``` + +Reacher DDPG + + +### Hopper DDPG - single worker + +```bash +python3 coach.py -p Mujoco_DDPG -lvl hopper +``` + +Hopper DDPG + + +### Half Cheetah DDPG - single worker + +```bash +python3 coach.py -p Mujoco_DDPG -lvl half_cheetah +``` + +Half Cheetah DDPG + + +### Walker 2D DDPG - single worker + +```bash +python3 coach.py -p Mujoco_DDPG -lvl walker2d +``` + +Walker 2D DDPG + + +### Ant DDPG - single worker + +```bash +python3 coach.py -p Mujoco_DDPG -lvl ant +``` + +Ant DDPG + + +### Swimmer DDPG - single worker + +```bash +python3 coach.py -p Mujoco_DDPG -lvl swimmer +``` + +Swimmer DDPG + + +### Humanoid DDPG - single worker + +```bash +python3 coach.py -p Mujoco_DDPG -lvl humanoid +``` + +Humanoid DDPG diff --git a/benchmarks/ddpg/ant_ddpg.png b/benchmarks/ddpg/ant_ddpg.png new file mode 100644 index 0000000..61678c1 Binary files /dev/null and b/benchmarks/ddpg/ant_ddpg.png differ diff --git a/benchmarks/ddpg/half_cheetah_ddpg.png b/benchmarks/ddpg/half_cheetah_ddpg.png new file mode 100644 index 0000000..9b6689f Binary files /dev/null and b/benchmarks/ddpg/half_cheetah_ddpg.png differ diff --git a/benchmarks/ddpg/hopper_ddpg.png b/benchmarks/ddpg/hopper_ddpg.png new file mode 100644 index 0000000..18061be Binary files /dev/null and b/benchmarks/ddpg/hopper_ddpg.png differ diff --git a/benchmarks/ddpg/humanoid_ddpg.png b/benchmarks/ddpg/humanoid_ddpg.png new file mode 100644 index 0000000..ba73d2f Binary files /dev/null and b/benchmarks/ddpg/humanoid_ddpg.png differ diff --git a/benchmarks/ddpg/inverted_double_pendulum_ddpg.png b/benchmarks/ddpg/inverted_double_pendulum_ddpg.png new file mode 100644 index 0000000..519da9e Binary files /dev/null and b/benchmarks/ddpg/inverted_double_pendulum_ddpg.png differ diff --git a/benchmarks/ddpg/inverted_pendulum_ddpg.png b/benchmarks/ddpg/inverted_pendulum_ddpg.png new file mode 100644 index 0000000..bd064a8 Binary files /dev/null and b/benchmarks/ddpg/inverted_pendulum_ddpg.png differ diff --git a/benchmarks/ddpg/reacher_ddpg.png b/benchmarks/ddpg/reacher_ddpg.png new file mode 100644 index 0000000..114d9cd Binary files /dev/null and b/benchmarks/ddpg/reacher_ddpg.png differ diff --git a/benchmarks/ddpg/swimmer_ddpg.png b/benchmarks/ddpg/swimmer_ddpg.png new file mode 100644 index 0000000..3e04fd7 Binary files /dev/null and b/benchmarks/ddpg/swimmer_ddpg.png differ diff --git a/benchmarks/ddpg/walker2d_ddpg.png b/benchmarks/ddpg/walker2d_ddpg.png new file mode 100644 index 0000000..50efd3c Binary files /dev/null and b/benchmarks/ddpg/walker2d_ddpg.png differ diff --git a/benchmarks/ddpg_her/README.md b/benchmarks/ddpg_her/README.md new file mode 100644 index 0000000..6dfdc57 --- /dev/null +++ b/benchmarks/ddpg_her/README.md @@ -0,0 +1,40 @@ +# DDPG with Hindsight Experience Replay + +Each experiment uses 3 seeds. +The parameters used for DDPG HER are the same parameters as described in the [following paper](https://arxiv.org/abs/1802.09464). 
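The mechanism that makes these runs sample-efficient is hindsight relabeling: each episode is additionally stored as if a goal that was actually achieved had been the intended one, with the sparse reward recomputed against that substituted goal. The snippet below sketches the simple "final" relabeling strategy; the transition layout and the `sparse_reward` helper are assumptions for illustration, not Coach's data structures.

```python
# Minimal sketch of HER's "final" goal-relabeling strategy (illustrative only).
import numpy as np

def sparse_reward(achieved_goal, goal, eps=0.05):
    # 0 when the achieved goal is within eps of the desired goal, -1 otherwise (Fetch-style reward)
    return 0.0 if np.linalg.norm(achieved_goal - goal) < eps else -1.0

def relabel_with_final_goal(episode):
    # episode: list of (state, action, achieved_goal, desired_goal) tuples (an assumed layout)
    final_goal = episode[-1][2]                  # the goal actually achieved at the end of the episode
    relabeled = []
    for state, action, achieved_goal, _ in episode:
        relabeled.append((state, action, final_goal,
                          sparse_reward(achieved_goal, final_goal)))
    return relabeled

episode = [(np.zeros(3), 0, np.array([0.10, 0.0, 0.0]), np.ones(3)),
           (np.zeros(3), 1, np.array([0.12, 0.0, 0.0]), np.ones(3))]
extra_transitions = relabel_with_final_goal(episode)   # stored in the replay buffer alongside the originals
```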
+ +### Fetch Reach DDPG HER - single worker + +```bash +python3 coach.py -p Fetch_DDPG_HER_baselines -lvl reach +``` + +Fetch DDPG HER Reach 1 Worker + + +### Fetch Push DDPG HER - 8 workers + +```bash +python3 coach.py -p Fetch_DDPG_HER_baselines -lvl push -n 8 +``` + +Fetch DDPG HER Push 8 Worker + + +### Fetch Slide DDPG HER - 8 workers + +```bash +python3 coach.py -p Fetch_DDPG_HER_baselines -lvl slide -n 8 +``` + +Fetch DDPG HER Slide 8 Worker + + +### Fetch Pick And Place DDPG HER - 8 workers + +```bash +python3 coach.py -p Fetch_DDPG_HER -lvl pick_and_place -n 8 +``` + +Fetch DDPG HER Pick And Place 8 Workers + diff --git a/benchmarks/ddpg_her/fetch_ddpg_her_pick_and_place_8_workers.png b/benchmarks/ddpg_her/fetch_ddpg_her_pick_and_place_8_workers.png new file mode 100644 index 0000000..59b7138 Binary files /dev/null and b/benchmarks/ddpg_her/fetch_ddpg_her_pick_and_place_8_workers.png differ diff --git a/benchmarks/ddpg_her/fetch_ddpg_her_push_8_workers.png b/benchmarks/ddpg_her/fetch_ddpg_her_push_8_workers.png new file mode 100644 index 0000000..8c088ad Binary files /dev/null and b/benchmarks/ddpg_her/fetch_ddpg_her_push_8_workers.png differ diff --git a/benchmarks/ddpg_her/fetch_ddpg_her_reach_1_worker.png b/benchmarks/ddpg_her/fetch_ddpg_her_reach_1_worker.png new file mode 100644 index 0000000..df0139c Binary files /dev/null and b/benchmarks/ddpg_her/fetch_ddpg_her_reach_1_worker.png differ diff --git a/benchmarks/ddpg_her/fetch_ddpg_her_slide_8_workers.png b/benchmarks/ddpg_her/fetch_ddpg_her_slide_8_workers.png new file mode 100644 index 0000000..d3d7623 Binary files /dev/null and b/benchmarks/ddpg_her/fetch_ddpg_her_slide_8_workers.png differ diff --git a/benchmarks/dfp/README.md b/benchmarks/dfp/README.md new file mode 100644 index 0000000..01ed6ae --- /dev/null +++ b/benchmarks/dfp/README.md @@ -0,0 +1,31 @@ +# DFP + +Each experiment uses 3 seeds. +The parameters used for DFP are the same parameters as described in the [original paper](https://arxiv.org/abs/1611.01779). + +### Doom Basic DFP - 8 workers + +```bash +python3 coach.py -p Doom_Basic_DFP -n 8 +``` + +Doom Basic DFP 8 workers + + +### Doom Health (D1: Basic) DFP - 8 workers + +```bash +python3 coach.py -p Doom_Health_DFP -n 8 +``` + +Doom Health DFP 8 workers + + + +### Doom Health Supreme (D2: Navigation) DFP - 8 workers + +```bash +python3 coach.py -p Doom_Health_Supreme_DFP -n 8 +``` + +Doom Health Supreme DFP 8 workers diff --git a/benchmarks/dfp/doom_basic_dfp_8_workers.png b/benchmarks/dfp/doom_basic_dfp_8_workers.png new file mode 100644 index 0000000..88369d5 Binary files /dev/null and b/benchmarks/dfp/doom_basic_dfp_8_workers.png differ diff --git a/benchmarks/dfp/doom_health_dfp_8_workers.png b/benchmarks/dfp/doom_health_dfp_8_workers.png new file mode 100644 index 0000000..bd448b3 Binary files /dev/null and b/benchmarks/dfp/doom_health_dfp_8_workers.png differ diff --git a/benchmarks/dfp/doom_health_supreme_dfp_8_workers.png b/benchmarks/dfp/doom_health_supreme_dfp_8_workers.png new file mode 100644 index 0000000..c22039a Binary files /dev/null and b/benchmarks/dfp/doom_health_supreme_dfp_8_workers.png differ diff --git a/benchmarks/dqn/README.md b/benchmarks/dqn/README.md new file mode 100644 index 0000000..97f1c5c --- /dev/null +++ b/benchmarks/dqn/README.md @@ -0,0 +1,14 @@ +# DQN + +Each experiment uses 3 seeds. +The parameters used for DQN are the same parameters as described in the [original paper](https://arxiv.org/abs/1607.05077.pdf). 
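+The Atari runs accept the same general switches as the rest of Coach. As a brief illustration
+(both flags are taken from coach.py's argument parser; the exact output location is the experiment
+directory that Coach creates for the run):
+
+```bash
+# Train Breakout while dumping evaluation GIFs and TensorBoard summaries
+# in addition to the regular CSV signal files.
+python3 coach.py -p Atari_DQN -lvl breakout -dg -tb
+```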
+ +### Breakout DQN - single worker + +```bash +python3 coach.py -p Atari_DQN -lvl breakout +``` + +Breakout DQN + + diff --git a/benchmarks/dqn/breakout_dqn.png b/benchmarks/dqn/breakout_dqn.png new file mode 100644 index 0000000..06fce22 Binary files /dev/null and b/benchmarks/dqn/breakout_dqn.png differ diff --git a/benchmarks/dueling_ddqn/README.md b/benchmarks/dueling_ddqn/README.md new file mode 100644 index 0000000..449e5af --- /dev/null +++ b/benchmarks/dueling_ddqn/README.md @@ -0,0 +1,14 @@ +# Dueling DDQN + +Each experiment uses 3 seeds and is trained for 10k environment steps. +The parameters used for Dueling DDQN are the same parameters as described in the [original paper](https://arxiv.org/abs/1706.01502). + +### Breakout Dueling DDQN - single worker + +```bash +python3 coach.py -p Atari_Dueling_DDQN -lvl breakout +``` + +Breakout Dueling DDQN + + diff --git a/benchmarks/dueling_ddqn/breakout_dueling_ddqn.png b/benchmarks/dueling_ddqn/breakout_dueling_ddqn.png new file mode 100644 index 0000000..10fdd69 Binary files /dev/null and b/benchmarks/dueling_ddqn/breakout_dueling_ddqn.png differ diff --git a/benchmarks/dueling_ddqn_with_per/README.md b/benchmarks/dueling_ddqn_with_per/README.md new file mode 100644 index 0000000..6cc83be --- /dev/null +++ b/benchmarks/dueling_ddqn_with_per/README.md @@ -0,0 +1,31 @@ +# Dueling DDQN with Prioritized Experience Replay + +Each experiment uses 3 seeds and is trained for 10k environment steps. +The parameters used for Dueling DDQN with PER are the same parameters as described in the [following paper](https://arxiv.org/abs/1511.05952). + +### Breakout Dueling DDQN with PER - single worker + +```bash +python3 coach.py -p Atari_Dueling_DDQN_with_PER_OpenAI -lvl breakout +``` + +Breakout Dueling DDQN with PER + + +### Pong Dueling DDQN with PER - single worker + +```bash +python3 coach.py -p Atari_Dueling_DDQN_with_PER_OpenAI -lvl pong +``` + +Pong Dueling DDQN with PER + + +### Space Invaders Dueling DDQN with PER - single worker + +```bash +python3 coach.py -p Atari_Dueling_DDQN_with_PER_OpenAI -lvl space_invaders +``` + +Space Invaders Dueling DDQN with PER + diff --git a/benchmarks/dueling_ddqn_with_per/breakout_dueling_ddqn_with_per.png b/benchmarks/dueling_ddqn_with_per/breakout_dueling_ddqn_with_per.png new file mode 100644 index 0000000..b7df622 Binary files /dev/null and b/benchmarks/dueling_ddqn_with_per/breakout_dueling_ddqn_with_per.png differ diff --git a/benchmarks/dueling_ddqn_with_per/pong_dueling_ddqn_with_per.png b/benchmarks/dueling_ddqn_with_per/pong_dueling_ddqn_with_per.png new file mode 100644 index 0000000..4f9ae2f Binary files /dev/null and b/benchmarks/dueling_ddqn_with_per/pong_dueling_ddqn_with_per.png differ diff --git a/benchmarks/dueling_ddqn_with_per/space_invaders_dueling_ddqn_with_per.png b/benchmarks/dueling_ddqn_with_per/space_invaders_dueling_ddqn_with_per.png new file mode 100644 index 0000000..8d577e1 Binary files /dev/null and b/benchmarks/dueling_ddqn_with_per/space_invaders_dueling_ddqn_with_per.png differ diff --git a/benchmarks/img/Ant_A3C_16_workers.png b/benchmarks/img/Ant_A3C_16_workers.png deleted file mode 100644 index d677ab0..0000000 Binary files a/benchmarks/img/Ant_A3C_16_workers.png and /dev/null differ diff --git a/benchmarks/img/Breakout_A3C_16_workers.png b/benchmarks/img/Breakout_A3C_16_workers.png deleted file mode 100644 index 0f778e2..0000000 Binary files a/benchmarks/img/Breakout_A3C_16_workers.png and /dev/null differ diff --git a/benchmarks/img/CartPole_PG.png 
b/benchmarks/img/CartPole_PG.png deleted file mode 100644 index 46779dc..0000000 Binary files a/benchmarks/img/CartPole_PG.png and /dev/null differ diff --git a/benchmarks/img/Doom_Basic_DQN.png b/benchmarks/img/Doom_Basic_DQN.png deleted file mode 100644 index 5f9382f..0000000 Binary files a/benchmarks/img/Doom_Basic_DQN.png and /dev/null differ diff --git a/benchmarks/img/Doom_Basic_Dueling_DDQN.png b/benchmarks/img/Doom_Basic_Dueling_DDQN.png deleted file mode 100644 index 34478f7..0000000 Binary files a/benchmarks/img/Doom_Basic_Dueling_DDQN.png and /dev/null differ diff --git a/benchmarks/img/Doom_Basic_NEC.png b/benchmarks/img/Doom_Basic_NEC.png deleted file mode 100644 index 79b5c6f..0000000 Binary files a/benchmarks/img/Doom_Basic_NEC.png and /dev/null differ diff --git a/benchmarks/img/Doom_Health_DFP.png b/benchmarks/img/Doom_Health_DFP.png deleted file mode 100644 index 3f8e16c..0000000 Binary files a/benchmarks/img/Doom_Health_DFP.png and /dev/null differ diff --git a/benchmarks/img/Doom_Health_MMC.png b/benchmarks/img/Doom_Health_MMC.png deleted file mode 100644 index d43f66b..0000000 Binary files a/benchmarks/img/Doom_Health_MMC.png and /dev/null differ diff --git a/benchmarks/img/Hopper_A3C_16_workers.png b/benchmarks/img/Hopper_A3C_16_workers.png deleted file mode 100644 index 2c2efa7..0000000 Binary files a/benchmarks/img/Hopper_A3C_16_workers.png and /dev/null differ diff --git a/benchmarks/img/Hopper_ClippedPPO_16_workers.png b/benchmarks/img/Hopper_ClippedPPO_16_workers.png deleted file mode 100644 index e9821d9..0000000 Binary files a/benchmarks/img/Hopper_ClippedPPO_16_workers.png and /dev/null differ diff --git a/benchmarks/img/Humanoid_ClippedPPO_16_workers.png b/benchmarks/img/Humanoid_ClippedPPO_16_workers.png deleted file mode 100644 index 0488c98..0000000 Binary files a/benchmarks/img/Humanoid_ClippedPPO_16_workers.png and /dev/null differ diff --git a/benchmarks/img/InvertedPendulum_ClippedPPO_16_workers.png b/benchmarks/img/InvertedPendulum_ClippedPPO_16_workers.png deleted file mode 100644 index b563024..0000000 Binary files a/benchmarks/img/InvertedPendulum_ClippedPPO_16_workers.png and /dev/null differ diff --git a/benchmarks/img/InvertedPendulum_NAF.png b/benchmarks/img/InvertedPendulum_NAF.png deleted file mode 100644 index 9b8b6f6..0000000 Binary files a/benchmarks/img/InvertedPendulum_NAF.png and /dev/null differ diff --git a/benchmarks/img/Inverted_Pendulum_A3C_16_workers.png b/benchmarks/img/Inverted_Pendulum_A3C_16_workers.png deleted file mode 100644 index d459990..0000000 Binary files a/benchmarks/img/Inverted_Pendulum_A3C_16_workers.png and /dev/null differ diff --git a/benchmarks/img/Pendulum_DDPG.png b/benchmarks/img/Pendulum_DDPG.png deleted file mode 100644 index 89abbac..0000000 Binary files a/benchmarks/img/Pendulum_DDPG.png and /dev/null differ diff --git a/benchmarks/img/Pendulum_NAF.png b/benchmarks/img/Pendulum_NAF.png deleted file mode 100644 index 0faca93..0000000 Binary files a/benchmarks/img/Pendulum_NAF.png and /dev/null differ diff --git a/benchmarks/img/Pong_DQN.png b/benchmarks/img/Pong_DQN.png deleted file mode 100644 index 6122c78..0000000 Binary files a/benchmarks/img/Pong_DQN.png and /dev/null differ diff --git a/benchmarks/img/Pong_NEC.png b/benchmarks/img/Pong_NEC.png deleted file mode 100644 index 4148669..0000000 Binary files a/benchmarks/img/Pong_NEC.png and /dev/null differ diff --git a/benchmarks/qr_dqn/README.md b/benchmarks/qr_dqn/README.md new file mode 100644 index 0000000..e5f558c --- /dev/null +++ 
b/benchmarks/qr_dqn/README.md @@ -0,0 +1,21 @@ +# Quantile Regression DQN + +Each experiment uses 3 seeds and is trained for 10k environment steps. +The parameters used for QR-DQN are the same parameters as described in the [original paper](https://arxiv.org/abs/1710.10044.pdf). + +### Breakout QR-DQN - single worker + +```bash +python3 coach.py -p Atari_QR_DQN -lvl breakout +``` + +Breakout QR-DQN + + +### Pong QR-DQN - single worker + +```bash +python3 coach.py -p Atari_QR_DQN -lvl pong +``` + +Pong QR-DQN diff --git a/benchmarks/qr_dqn/breakout_qr_dqn.png b/benchmarks/qr_dqn/breakout_qr_dqn.png new file mode 100644 index 0000000..09b1c1c Binary files /dev/null and b/benchmarks/qr_dqn/breakout_qr_dqn.png differ diff --git a/benchmarks/qr_dqn/pong_qr_dqn.png b/benchmarks/qr_dqn/pong_qr_dqn.png new file mode 100644 index 0000000..8a39cfe Binary files /dev/null and b/benchmarks/qr_dqn/pong_qr_dqn.png differ diff --git a/coach.py b/coach.py deleted file mode 100644 index 8ba8cf3..0000000 --- a/coach.py +++ /dev/null @@ -1,333 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -import sys, inspect, re -import os -import json -import presets -from presets import * -from utils import set_gpu, list_all_classes_in_module -from architectures import * -from environments import * -from agents import * -from utils import * -from logger import screen, logger -import argparse -from subprocess import Popen -import datetime -import presets -import atexit -import sys -import subprocess -from threading import Thread - -if len(set(failed_imports)) > 0: - screen.warning("Warning: failed to import the following packages - {}".format(', '.join(set(failed_imports)))) - - -def set_framework(framework_type): - # choosing neural network framework - framework = Frameworks().get(framework_type) - sess = None - if framework == Frameworks.TensorFlow: - import tensorflow as tf - config = tf.ConfigProto() - config.allow_soft_placement = True - config.gpu_options.allow_growth = True - config.gpu_options.per_process_gpu_memory_fraction = 0.2 - sess = tf.Session(config=config) - elif framework == Frameworks.Neon: - import ngraph as ng - sess = ng.transformers.make_transformer() - screen.log_title("Using {} framework".format(Frameworks().to_string(framework))) - return sess - - -def check_input_and_fill_run_dict(parser): - args = parser.parse_args() - - # if no arg is given - if len(sys.argv) == 1: - parser.print_help() - exit(0) - - # list available presets - if args.list: - presets_lists = list_all_classes_in_module(presets) - screen.log_title("Available Presets:") - for preset in presets_lists: - print(preset) - sys.exit(0) - - # check inputs - try: - # num_workers = int(args.num_workers) - num_workers = int(re.match("^\d+$", args.num_workers).group(0)) - except ValueError: - screen.error("Parameter num_workers should be an integer.") - - preset_names = list_all_classes_in_module(presets) - if args.preset is not None and args.preset not in preset_names: - screen.error("A 
non-existing preset was selected. ") - - if args.checkpoint_restore_dir is not None and not os.path.exists(args.checkpoint_restore_dir): - screen.error("The requested checkpoint folder to load from does not exist. ") - - if args.save_model_sec is not None: - try: - args.save_model_sec = int(args.save_model_sec) - except ValueError: - screen.error("Parameter save_model_sec should be an integer.") - - if args.preset is None and (args.agent_type is None or args.environment_type is None - or args.exploration_policy_type is None) and not args.play: - screen.error('When no preset is given for Coach to run, the user is expected to input the desired agent_type,' - ' environment_type and exploration_policy_type to assemble a preset. ' - '\nAt least one of these parameters was not given.') - elif args.preset is None and args.play and args.environment_type is None: - screen.error('When no preset is given for Coach to run, and the user requests human control over the environment,' - ' the user is expected to input the desired environment_type and level.' - '\nAt least one of these parameters was not given.') - elif args.preset is None and args.play and args.environment_type: - args.agent_type = 'Human' - args.exploration_policy_type = 'ExplorationParameters' - - # get experiment name and path - experiment_name = logger.get_experiment_name(args.experiment_name) - experiment_path = logger.get_experiment_path(experiment_name) - - if args.play and num_workers > 1: - screen.warning("Playing the game as a human is only available with a single worker. " - "The number of workers will be reduced to 1") - num_workers = 1 - - # fill run_dict - run_dict = dict() - run_dict['agent_type'] = args.agent_type - run_dict['environment_type'] = args.environment_type - run_dict['exploration_policy_type'] = args.exploration_policy_type - run_dict['level'] = args.level - run_dict['preset'] = args.preset - run_dict['custom_parameter'] = args.custom_parameter - run_dict['experiment_path'] = experiment_path - run_dict['framework'] = Frameworks().get(args.framework) - run_dict['play'] = args.play - run_dict['evaluate'] = args.evaluate# or args.play - - # multi-threading parameters - run_dict['num_threads'] = num_workers - - # checkpoints - run_dict['save_model_sec'] = args.save_model_sec - run_dict['save_model_dir'] = experiment_path if args.save_model_sec is not None else None - run_dict['checkpoint_restore_dir'] = args.checkpoint_restore_dir - - # visualization - run_dict['visualization.dump_gifs'] = args.dump_gifs - run_dict['visualization.render'] = args.render - run_dict['visualization.tensorboard'] = args.tensorboard - - return args, run_dict - - -def run_dict_to_json(_run_dict, task_id=''): - if task_id != '': - json_path = os.path.join(_run_dict['experiment_path'], 'run_dict_worker{}.json'.format(task_id)) - else: - json_path = os.path.join(_run_dict['experiment_path'], 'run_dict.json') - - with open(json_path, 'w') as outfile: - json.dump(_run_dict, outfile, indent=2) - - return json_path - - -if __name__ == "__main__": - parser = argparse.ArgumentParser() - parser.add_argument('-p', '--preset', - help="(string) Name of a preset to run (as configured in presets.py)", - default=None, - type=str) - parser.add_argument('-l', '--list', - help="(flag) List all available presets", - action='store_true') - parser.add_argument('-e', '--experiment_name', - help="(string) Experiment name to be used to store the results.", - default='', - type=str) - parser.add_argument('-r', '--render', - help="(flag) Render environment", - 
action='store_true') - parser.add_argument('-f', '--framework', - help="(string) Neural network framework. Available values: tensorflow, neon", - default='tensorflow', - type=str) - parser.add_argument('-n', '--num_workers', - help="(int) Number of workers for multi-process based agents, e.g. A3C", - default='1', - type=str) - parser.add_argument('--play', - help="(flag) Play as a human by controlling the game with the keyboard. " - "This option will save a replay buffer with the game play.", - action='store_true') - parser.add_argument('--evaluate', - help="(flag) Run evaluation only. This is a convenient way to disable " - "training in order to evaluate an existing checkpoint.", - action='store_true') - parser.add_argument('-v', '--verbose', - help="(flag) Don't suppress TensorFlow debug prints.", - action='store_true') - parser.add_argument('-s', '--save_model_sec', - help="(int) Time in seconds between saving checkpoints of the model.", - default=None, - type=int) - parser.add_argument('-crd', '--checkpoint_restore_dir', - help='(string) Path to a folder containing a checkpoint to restore the model from.', - type=str) - parser.add_argument('-dg', '--dump_gifs', - help="(flag) Enable the gif saving functionality.", - action='store_true') - parser.add_argument('-at', '--agent_type', - help="(string) Choose an agent type class to override on top of the selected preset. " - "If no preset is defined, a preset can be set from the command-line by combining settings " - "which are set by using --agent_type, --experiment_type, --environemnt_type", - default=None, - type=str) - parser.add_argument('-et', '--environment_type', - help="(string) Choose an environment type class to override on top of the selected preset." - "If no preset is defined, a preset can be set from the command-line by combining settings " - "which are set by using --agent_type, --experiment_type, --environemnt_type", - default=None, - type=str) - parser.add_argument('-ept', '--exploration_policy_type', - help="(string) Choose an exploration policy type class to override on top of the selected " - "preset." - "If no preset is defined, a preset can be set from the command-line by combining settings " - "which are set by using --agent_type, --experiment_type, --environemnt_type" - , - default=None, - type=str) - parser.add_argument('-lvl', '--level', - help="(string) Choose the level that will be played in the environment that was selected." - "This value will override the level parameter in the environment class." - , - default=None, - type=str) - parser.add_argument('-cp', '--custom_parameter', - help="(string) Semicolon separated parameters used to override specific parameters on top of" - " the selected preset (or on top of the command-line assembled one). " - "Whenever a parameter value is a string, it should be inputted as '\\\"string\\\"'. " - "For ex.: " - "\"visualization.render=False; num_training_iterations=500; optimizer='rmsprop'\"", - default=None, - type=str) - parser.add_argument('--print_parameters', - help="(flag) Print tuning_parameters to stdout", - action='store_true') - parser.add_argument('-tb', '--tensorboard', - help="(flag) When using the TensorFlow backend, enable TensorBoard log dumps. 
", - action='store_true') - parser.add_argument('-ns', '--no_summary', - help="(flag) Prevent Coach from printing a summary and asking questions at the end of runs", - action='store_true') - - args, run_dict = check_input_and_fill_run_dict(parser) - - # turn TF debug prints off - if not args.verbose and args.framework.lower() == 'tensorflow': - os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' - - # dump documentation - logger.set_dump_dir(run_dict['experiment_path'], add_timestamp=True) - if not args.no_summary: - atexit.register(logger.summarize_experiment) - screen.change_terminal_title(logger.experiment_name) - - # Single-threaded runs - if run_dict['num_threads'] == 1: - # set tuning parameters - json_run_dict_path = run_dict_to_json(run_dict) - tuning_parameters = json_to_preset(json_run_dict_path) - tuning_parameters.sess = set_framework(args.framework) - - if args.print_parameters: - print('tuning_parameters', tuning_parameters) - - # Single-thread runs - tuning_parameters.task_index = 0 - env_instance = create_environment(tuning_parameters) - agent = eval(tuning_parameters.agent.type + '(env_instance, tuning_parameters)') - - # Start the training or evaluation - if tuning_parameters.evaluate: - agent.evaluate(sys.maxsize, keep_networks_synced=True) # evaluate forever - else: - agent.improve() - - # Multi-threaded runs - else: - assert args.framework.lower() == 'tensorflow', "Distributed training works only with TensorFlow" - os.environ["OMP_NUM_THREADS"]="1" - # set parameter server and workers addresses - ps_hosts = "localhost:{}".format(get_open_port()) - worker_hosts = ",".join(["localhost:{}".format(get_open_port()) for i in range(run_dict['num_threads'] + 1)]) - - # Make sure to disable GPU so that all the workers will use the CPU - set_cpu() - - # create a parameter server - cmd = [ - "python3", - "./parallel_actor.py", - "--ps_hosts={}".format(ps_hosts), - "--worker_hosts={}".format(worker_hosts), - "--job_name=ps", - ] - parameter_server = Popen(cmd) - - screen.log_title("*** Distributed Training ***") - time.sleep(1) - - # create N training workers and 1 evaluating worker - workers = [] - - for i in range(run_dict['num_threads'] + 1): - # this is the evaluation worker - run_dict['task_id'] = i - if i == run_dict['num_threads']: - run_dict['evaluate_only'] = True - run_dict['visualization.render'] = args.render - else: - run_dict['evaluate_only'] = False - run_dict['visualization.render'] = False # #In a parallel setting, only the evaluation agent renders - - json_run_dict_path = run_dict_to_json(run_dict, i) - workers_args = ["python3", "./parallel_actor.py", - "--ps_hosts={}".format(ps_hosts), - "--worker_hosts={}".format(worker_hosts), - "--job_name=worker", - "--load_json={}".format(json_run_dict_path)] - - p = Popen(workers_args) - - if i != run_dict['num_threads']: - workers.append(p) - else: - evaluation_worker = p - - # wait for all workers - [w.wait() for w in workers] - evaluation_worker.kill() diff --git a/configurations.py b/configurations.py deleted file mode 100644 index a235c6c..0000000 --- a/configurations.py +++ /dev/null @@ -1,628 +0,0 @@ -# -# Copyright (c) 2017 Intel Corporation -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -from utils import Enum -import json -import types - - -class Frameworks(Enum): - TensorFlow = 1 - Neon = 2 - - -class InputTypes(object): - Observation = 1 - Measurements = 2 - GoalVector = 3 - Action = 4 - TimedObservation = 5 - - -class OutputTypes(object): - Q = 1 - DuelingQ = 2 - V = 3 - Pi = 4 - MeasurementsPrediction = 5 - DNDQ = 6 - NAF = 7 - PPO = 8 - PPO_V = 9 - CategoricalQ = 10 - QuantileRegressionQ = 11 - - - -class EmbedderDepth(object): - Shallow = 1 - Deep = 2 - - -class EmbedderWidth(object): - Narrow = 1 - Wide = 2 - - -class MiddlewareTypes(object): - LSTM = 1 - FC = 2 - - -class Parameters(object): - def __str__(self): - parameters = {} - for k, v in self.__dict__.items(): - if isinstance(v, type) and issubclass(v, Parameters): - # v.__dict__ doesn't return a dictionary but a mappingproxy - # which json doesn't serialize, so convert it into a normal - # dictionary - parameters[k] = dict(v.__dict__.items()) - elif isinstance(v, types.MappingProxyType): - parameters[k] = dict(v.items()) - else: - parameters[k] = v - - return json.dumps(parameters, indent=4, default=repr) - - -class AgentParameters(Parameters): - agent = '' - - # Architecture parameters - input_types = {'observation': InputTypes.Observation} - output_types = [OutputTypes.Q] - middleware_type = MiddlewareTypes.FC - loss_weights = [1.0] - stop_gradients_from_head = [False] - embedder_depth = EmbedderDepth.Shallow - embedder_width = EmbedderWidth.Wide - num_output_head_copies = 1 - use_measurements = False - use_accumulated_reward_as_measurement = False - add_a_normalized_timestep_to_the_observation = False - l2_regularization = 0 - hidden_layers_activation_function = 'relu' - optimizer_type = 'Adam' - async_training = False - use_separate_networks_per_head = False - - # Agent parameters - num_consecutive_playing_steps = 1 - num_consecutive_training_steps = 1 - update_evaluation_agent_network_after_every_num_steps = 3000 - bootstrap_total_return_from_old_policy = False - n_step = -1 - num_episodes_in_experience_replay = 200 - num_transitions_in_experience_replay = None - discount = 0.99 - policy_gradient_rescaler = 'A_VALUE' - apply_gradients_every_x_episodes = 5 - beta_entropy = 0 - num_steps_between_gradient_updates = 20000 # t_max - num_steps_between_copying_online_weights_to_target = 1000 - rate_for_copying_weights_to_target = 1.0 - monte_carlo_mixing_rate = 0.1 - gae_lambda = 0.96 - step_until_collecting_full_episodes = False - targets_horizon = 'N-Step' - replace_mse_with_huber_loss = False - load_memory_from_file_path = None - collect_new_data = True - input_rescaler = 255.0 - - # PPO related params - target_kl_divergence = 0.01 - initial_kl_coefficient = 1.0 - high_kl_penalty_coefficient = 1000 - value_targets_mix_fraction = 0.1 - clip_likelihood_ratio_using_epsilon = None - use_kl_regularization = True - estimate_value_using_gae = False - - # DFP related params - num_predicted_steps_ahead = 6 - goal_vector = [1.0, 1.0] - future_measurements_weights = [0.5, 0.5, 1.0] - - # NEC related params - dnd_size = 500000 - l2_norm_added_delta = 0.001 - new_value_shift_coefficient = 0.1 - 
number_of_knn = 50 - DND_key_error_threshold = 0.01 - - # Framework support - neon_support = False - tensorflow_support = True - - # distributed agents params - shared_optimizer = True - share_statistics_between_workers = True - - -class EnvironmentParameters(Parameters): - type = 'Doom' - level = 'basic' - observation_stack_size = 4 - frame_skip = 4 - desired_observation_width = 76 - desired_observation_height = 60 - normalize_observation = False - crop_observation = False - random_initialization_steps = 0 - reward_scaling = 1.0 - reward_clipping_min = None - reward_clipping_max = None - human_control = False - - -class ExplorationParameters(Parameters): - # Exploration policies - policy = 'EGreedy' - evaluation_policy = 'Greedy' - # -- bootstrap dqn parameters - bootstrapped_data_sharing_probability = 0.5 - architecture_num_q_heads = 1 - # -- dropout approximation of thompson sampling parameters - dropout_discard_probability = 0 - initial_keep_probability = 0.0 # unused - final_keep_probability = 0.99 # unused - keep_probability_decay_steps = 50000 # unused - # -- epsilon greedy parameters - initial_epsilon = 0.5 - final_epsilon = 0.01 - epsilon_decay_steps = 50000 - evaluation_epsilon = 0.05 - # -- epsilon greedy at end of episode parameters - average_episode_length_over_num_episodes = 20 - # -- boltzmann softmax parameters - initial_temperature = 100.0 - final_temperature = 1.0 - temperature_decay_steps = 50000 - # -- additive noise - initial_noise_variance_percentage = 0.1 - final_noise_variance_percentage = 0.1 - noise_variance_decay_steps = 1 - # -- Ornstein-Uhlenbeck process - mu = 0 - theta = 0.15 - sigma = 0.3 - dt = 0.01 - - -class GeneralParameters(Parameters): - train = True - framework = Frameworks.TensorFlow - threads = 1 - sess = None - - # distributed training options - num_threads = 1 - synchronize_over_num_threads = 1 - distributed = False - - # Agent blocks - memory = 'EpisodicExperienceReplay' - architecture = 'GeneralTensorFlowNetwork' - - # General parameters - clip_gradients = None - kl_divergence_constraint = 100000 - num_training_iterations = 10000000000 - num_heatup_steps = 1000 - heatup_using_network_decisions = False - batch_size = 32 - save_model_sec = None - save_model_dir = None - checkpoint_restore_dir = None - learning_rate = 0.00025 - learning_rate_decay_rate = 0 - learning_rate_decay_steps = 0 - evaluation_episodes = 5 - evaluate_every_x_episodes = 1000000 - evaluate_every_x_training_iterations = 0 - rescaling_interpolation_type = 'bilinear' - current_episode = 0 - - # setting a seed will only work for non-parallel algorithms. Parallel algorithms add uncontrollable noise in - # the form of different workers starting at different times, and getting different assignments of CPU - # time from the OS. 
- seed = None - - checkpoints_path = '' - - # Testing parameters - test = False - test_min_return_threshold = 0 - test_max_step_threshold = 1 - test_num_workers = 1 - - -class VisualizationParameters(Parameters): - # Visualization parameters - record_video_every = 1000 - video_path = '/home/llt_lab/temp/breakout-videos' - plot_action_values_online = False - show_saliency_maps_every_num_episodes = 1000000000 - render_observation = False - print_summary = False - dump_csv = True - dump_signals_to_csv_every_x_episodes = 5 - render = False - dump_gifs = True - max_fps_for_human_control = 10 - tensorboard = False - - -class Roboschool(EnvironmentParameters): - type = 'Gym' - frame_skip = 1 - observation_stack_size = 1 - desired_observation_height = None - desired_observation_width = None - - -class GymVectorObservation(EnvironmentParameters): - type = 'Gym' - frame_skip = 1 - observation_stack_size = 1 - desired_observation_height = None - desired_observation_width = None - - -class Bullet(EnvironmentParameters): - type = 'Bullet' - frame_skip = 1 - observation_stack_size = 1 - desired_observation_height = None - desired_observation_width = None - - -class Atari(EnvironmentParameters): - type = 'Gym' - frame_skip = 4 - observation_stack_size = 4 - desired_observation_height = 84 - desired_observation_width = 84 - reward_clipping_max = 1.0 - reward_clipping_min = -1.0 - random_initialization_steps = 30 - crop_observation = False # in the original paper the observation is cropped but not in the Nature paper - - -class Doom(EnvironmentParameters): - type = 'Doom' - frame_skip = 4 - observation_stack_size = 3 - desired_observation_height = 60 - desired_observation_width = 76 - - -class Carla(EnvironmentParameters): - type = 'Carla' - frame_skip = 1 - observation_stack_size = 4 - desired_observation_height = 128 - desired_observation_width = 180 - normalize_observation = False - server_height = 256 - server_width = 360 - config = 'environments/CarlaSettings.ini' - level = 'town1' - verbose = True - stereo = False - semantic_segmentation = False - depth = False - episode_max_time = 100000 # miliseconds for each episode - continuous_to_bool_threshold = 0.5 - allow_braking = False - - -class Human(AgentParameters): - type = 'HumanAgent' - num_episodes_in_experience_replay = 10000000 - - -class NStepQ(AgentParameters): - type = 'NStepQAgent' - input_types = {'observation': InputTypes.Observation} - output_types = [OutputTypes.Q] - loss_weights = [1.0] - optimizer_type = 'Adam' - num_steps_between_copying_online_weights_to_target = 1000 - num_episodes_in_experience_replay = 2 - apply_gradients_every_x_episodes = 1 - num_steps_between_gradient_updates = 20 # this is called t_max in all the papers - hidden_layers_activation_function = 'elu' - targets_horizon = 'N-Step' - async_training = True - shared_optimizer = True - - -class DQN(AgentParameters): - type = 'DQNAgent' - input_types = {'observation': InputTypes.Observation} - output_types = [OutputTypes.Q] - loss_weights = [1.0] - optimizer_type = 'Adam' - num_steps_between_copying_online_weights_to_target = 1000 - neon_support = True - async_training = True - shared_optimizer = True - - -class DDQN(DQN): - type = 'DDQNAgent' - num_steps_between_copying_online_weights_to_target = 30000 - - -class DuelingDQN(DQN): - type = 'DQNAgent' - output_types = [OutputTypes.DuelingQ] - - -class BootstrappedDQN(DQN): - type = 'BootstrappedDQNAgent' - num_output_head_copies = 10 - - -class CategoricalDQN(DQN): - type = 'CategoricalDQNAgent' - output_types = 
[OutputTypes.CategoricalQ] - v_min = -10.0 - v_max = 10.0 - atoms = 51 - neon_support = False - - -class QuantileRegressionDQN(DQN): - type = 'QuantileRegressionDQNAgent' - output_types = [OutputTypes.QuantileRegressionQ] - atoms = 51 - - -class NEC(AgentParameters): - type = 'NECAgent' - optimizer_type = 'Adam' - input_types = {'observation': InputTypes.Observation} - output_types = [OutputTypes.DNDQ] - loss_weights = [1.0] - dnd_size = 500000 - l2_norm_added_delta = 0.001 - new_value_shift_coefficient = 0.1 # alpha - number_of_knn = 50 - n_step = 100 - bootstrap_total_return_from_old_policy = True - DND_key_error_threshold = 0 - input_rescaler = 1.0 - num_consecutive_playing_steps = 4 - - -class ActorCritic(AgentParameters): - type = 'ActorCriticAgent' - input_types = {'observation': InputTypes.Observation} - output_types = [OutputTypes.V, OutputTypes.Pi] - loss_weights = [0.5, 1.0] - stop_gradients_from_head = [False, False] - num_episodes_in_experience_replay = 2 - policy_gradient_rescaler = 'A_VALUE' - hidden_layers_activation_function = 'elu' - apply_gradients_every_x_episodes = 5 - beta_entropy = 0 - num_steps_between_gradient_updates = 5000 # this is called t_max in all the papers - gae_lambda = 0.96 - shared_optimizer = True - estimate_value_using_gae = False - async_training = True - - -class PolicyGradient(AgentParameters): - type = 'PolicyGradientsAgent' - input_types = {'observation': InputTypes.Observation} - output_types = [OutputTypes.Pi] - loss_weights = [1.0] - num_episodes_in_experience_replay = 2 - policy_gradient_rescaler = 'FUTURE_RETURN_NORMALIZED_BY_TIMESTEP' - apply_gradients_every_x_episodes = 5 - beta_entropy = 0 - num_steps_between_gradient_updates = 20000 # this is called t_max in all the papers - async_training = True - - -class DDPG(AgentParameters): - type = 'DDPGAgent' - input_types = {'observation': InputTypes.Observation, 'action': InputTypes.Action} - output_types = [OutputTypes.V] # V is used because we only want a single Q value - loss_weights = [1.0] - hidden_layers_activation_function = 'relu' - num_episodes_in_experience_replay = 10000 - num_steps_between_copying_online_weights_to_target = 1 - rate_for_copying_weights_to_target = 0.001 - shared_optimizer = True - async_training = True - - -class DDDPG(AgentParameters): - type = 'DDPGAgent' - input_types = {'observation': InputTypes.Observation, 'action': InputTypes.Action} - output_types = [OutputTypes.V] # V is used because we only want a single Q value - loss_weights = [1.0] - hidden_layers_activation_function = 'relu' - num_episodes_in_experience_replay = 10000 - num_steps_between_copying_online_weights_to_target = 10 - rate_for_copying_weights_to_target = 1 - shared_optimizer = True - async_training = True - - -class NAF(AgentParameters): - type = 'NAFAgent' - input_types = {'observation': InputTypes.Observation} - output_types = [OutputTypes.NAF] - loss_weights = [1.0] - hidden_layers_activation_function = 'tanh' - num_consecutive_training_steps = 5 - num_steps_between_copying_online_weights_to_target = 1 - rate_for_copying_weights_to_target = 0.001 - optimizer_type = 'RMSProp' - async_training = True - - -class PPO(AgentParameters): - type = 'PPOAgent' - input_types = {'observation': InputTypes.Observation} - output_types = [OutputTypes.V] - loss_weights = [1.0] - hidden_layers_activation_function = 'tanh' - num_episodes_in_experience_replay = 1000000 - policy_gradient_rescaler = 'A_VALUE' - gae_lambda = 0.96 - target_kl_divergence = 0.01 - initial_kl_coefficient = 1.0 - 
high_kl_penalty_coefficient = 1000 - add_a_normalized_timestep_to_the_observation = True - l2_regularization = 0#1e-3 - value_targets_mix_fraction = 0.1 - async_training = True - estimate_value_using_gae = True - step_until_collecting_full_episodes = True - - -class ClippedPPO(AgentParameters): - type = 'ClippedPPOAgent' - input_types = {'observation': InputTypes.Observation} - output_types = [OutputTypes.V, OutputTypes.PPO] - loss_weights = [0.5, 1.0] - stop_gradients_from_head = [False, False] - hidden_layers_activation_function = 'tanh' - num_episodes_in_experience_replay = 1000000 - policy_gradient_rescaler = 'GAE' - gae_lambda = 0.95 - target_kl_divergence = 0.01 - initial_kl_coefficient = 1.0 - high_kl_penalty_coefficient = 1000 - add_a_normalized_timestep_to_the_observation = False - l2_regularization = 1e-3 - value_targets_mix_fraction = 0.1 - clip_likelihood_ratio_using_epsilon = 0.2 - async_training = False - use_kl_regularization = False - estimate_value_using_gae = True - batch_size = 64 - use_separate_networks_per_head = True - step_until_collecting_full_episodes = True - beta_entropy = 0.01 - - -class DFP(AgentParameters): - type = 'DFPAgent' - input_types = { - 'observation': InputTypes.Observation, - 'measurements': InputTypes.Measurements, - 'goal': InputTypes.GoalVector - } - output_types = [OutputTypes.MeasurementsPrediction] - loss_weights = [1.0] - use_measurements = True - num_predicted_steps_ahead = 6 - goal_vector = [1.0, 1.0] - future_measurements_weights = [0.5, 0.5, 1.0] - async_training = True - - -class MMC(AgentParameters): - type = 'MixedMonteCarloAgent' - input_types = {'observation': InputTypes.Observation} - output_types = [OutputTypes.Q] - loss_weights = [1.0] - num_steps_between_copying_online_weights_to_target = 1000 - monte_carlo_mixing_rate = 0.1 - neon_support = True - - -class PAL(AgentParameters): - type = 'PALAgent' - input_types = {'observation': InputTypes.Observation} - output_types = [OutputTypes.Q] - loss_weights = [1.0] - pal_alpha = 0.9 - persistent_advantage_learning = False - num_steps_between_copying_online_weights_to_target = 1000 - neon_support = True - - -class BC(AgentParameters): - type = 'BCAgent' - input_types = {'observation': InputTypes.Observation} - output_types = [OutputTypes.Q] - loss_weights = [1.0] - collect_new_data = False - evaluate_every_x_training_iterations = 50000 - - -class EGreedyExploration(ExplorationParameters): - policy = 'EGreedy' - initial_epsilon = 0.5 - final_epsilon = 0.01 - epsilon_decay_steps = 50000 - evaluation_epsilon = 0.05 - initial_noise_variance_percentage = 0.1 - final_noise_variance_percentage = 0.1 - noise_variance_decay_steps = 50000 - - -class BootstrappedDQNExploration(ExplorationParameters): - policy = 'Bootstrapped' - architecture_num_q_heads = 10 - bootstrapped_data_sharing_probability = 0.1 - - -class OUExploration(ExplorationParameters): - policy = 'OUProcess' - mu = 0 - theta = 0.15 - sigma = 0.3 - dt = 0.01 - - -class AdditiveNoiseExploration(ExplorationParameters): - policy = 'AdditiveNoise' - initial_noise_variance_percentage = 0.1 - final_noise_variance_percentage = 0.1 - noise_variance_decay_steps = 50000 - - -class EntropyExploration(ExplorationParameters): - policy = 'ContinuousEntropy' - - -class CategoricalExploration(ExplorationParameters): - policy = 'Categorical' - - -class Preset(GeneralParameters): - def __init__(self, agent, env, exploration, visualization=VisualizationParameters): - """ - :type agent: AgentParameters - :type env: EnvironmentParameters - :type 
exploration: ExplorationParameters - :type visualization: VisualizationParameters - """ - self.visualization = visualization - self.agent = agent - self.env = env - self.exploration = exploration diff --git a/dashboard_components/boards.py b/dashboard_components/boards.py deleted file mode 100644 index dcbb5a3..0000000 --- a/dashboard_components/boards.py +++ /dev/null @@ -1,18 +0,0 @@ -from bokeh.layouts import column -from bokeh.models.widgets import Panel, Tabs -from dashboard_components.experiment_board import experiment_board_layout -from dashboard_components.globals import spinner, layouts -from bokeh.models.widgets import Div - -# ---------------- Build Website Layout ------------------- - -# title -title = Div(text="""
Coach Dashboard
""") - -tab1 = Panel(child=experiment_board_layout, title='experiment board') -tabs = Tabs(tabs=[tab1]) - -layout = column(title, tabs) -layout = column(layout, spinner) - -layouts['boards'] = layout diff --git a/dashboard_components/signals_file.py b/dashboard_components/signals_file.py deleted file mode 100644 index 1e89e18..0000000 --- a/dashboard_components/signals_file.py +++ /dev/null @@ -1,39 +0,0 @@ -import os - -import pandas as pd -from pandas.errors import EmptyDataError - -from dashboard_components.signals_file_base import SignalsFileBase -from utils import break_file_path - - -class SignalsFile(SignalsFileBase): - def __init__(self, csv_path, load=True, plot=None): - super().__init__(plot) - self.full_csv_path = csv_path - self.dir, self.filename, _ = break_file_path(csv_path) - if load: - self.load() - # this helps set the correct x axis - self.change_averaging_window(1, force=True) - - def load_csv(self): - # load csv and fix sparse data. - # csv can be in the middle of being written so we use try - except - self.csv = None - while self.csv is None: - try: - self.csv = pd.read_csv(self.full_csv_path) - break - except EmptyDataError: - self.csv = None - continue - self.csv = self.csv.interpolate() - self.csv.fillna(value=0, inplace=True) - - self.csv['Wall-Clock Time'] /= 60. - - self.last_modified = os.path.getmtime(self.full_csv_path) - - def file_was_modified_on_disk(self): - return self.last_modified != os.path.getmtime(self.full_csv_path) \ No newline at end of file diff --git a/docs/404.html b/docs/404.html new file mode 100644 index 0000000..0779c3a --- /dev/null +++ b/docs/404.html @@ -0,0 +1,244 @@ + + + + + + + + + + + Reinforcement Learning Coach + + + + + + + + + + + + + + + +
[docs/404.html body elided: the HTML markup was stripped during extraction; the page's visible text is the "Docs »" breadcrumb, a "404" heading, and the message "Page not found".]
diff --git a/architectures/neon_components/__init__.py b/docs/__init__.py similarity index 100% rename from architectures/neon_components/__init__.py rename to docs/__init__.py diff --git a/docs/algorithms/imitation/bc/index.html b/docs/algorithms/imitation/bc/index.html index c185dec..cb972a2 100644 --- a/docs/algorithms/imitation/bc/index.html +++ b/docs/algorithms/imitation/bc/index.html @@ -3,33 +3,29 @@ [head markup stripped during extraction; the recoverable change is the page title, "Behavioral Cloning - Reinforcement Learning Coach Documentation" replaced with "Behavioral Cloning - Reinforcement Learning Coach"] @@ -40,7 +36,7 @@