mirror of https://github.com/gryf/coach.git synced 2025-12-17 11:10:20 +01:00
coach/docs/mkdocs.yml
Itai Caspi 125c7ee38d Release 0.9
Main changes are detailed below:

New features -
* CARLA 0.7 simulator integration
* Human control of the game play
* Recording of human game play and storing / loading the replay buffer
* Behavioral cloning agent and presets
* Golden tests for several presets
* Selecting between deep / shallow image embedders
* Rendering through pygame (with some boost in performance)

API changes -
* Improved environment wrapper API
* Added an evaluate flag to allow convenient evaluation of existing checkpoints
* Improved frameskip definition in Gym

Bug fixes -
* Fixed loading of checkpoints for agents with more than one network
* Fixed Python 3 compatibility of the N-Step Q Learning agent
2017-12-19 19:27:16 +02:00

39 lines
1.8 KiB
YAML

site_name: Reinforcement Learning Coach Documentation
theme: readthedocs
site_description: 'Reinforcement Learning Coach Documentation by Intel Nervana.'
markdown_extensions:
    - mdx_math:
        enable_dollar_delimiter: True  # for use of inline $..$
extra_javascript: ['https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_HTML']
extra_css: [extra.css]
pages:
- Home : index.md
- Design: design.md
- Usage: usage.md
- Algorithms:
    - 'DQN' : algorithms/value_optimization/dqn.md
    - 'Double DQN' : algorithms/value_optimization/double_dqn.md
    - 'Dueling DQN' : algorithms/value_optimization/dueling_dqn.md
    - 'Categorical DQN' : algorithms/value_optimization/categorical_dqn.md
    - 'Mixed Monte Carlo' : algorithms/value_optimization/mmc.md
    - 'Persistent Advantage Learning' : algorithms/value_optimization/pal.md
    - 'Neural Episodic Control' : algorithms/value_optimization/nec.md
    - 'Bootstrapped DQN' : algorithms/value_optimization/bs_dqn.md
    - 'N-Step Q Learning' : algorithms/value_optimization/n_step.md
    - 'Normalized Advantage Functions' : algorithms/value_optimization/naf.md
    - 'Policy Gradient' : algorithms/policy_optimization/pg.md
    - 'Actor-Critic' : algorithms/policy_optimization/ac.md
    - 'Deep Deterministic Policy Gradients' : algorithms/policy_optimization/ddpg.md
    - 'Proximal Policy Optimization' : algorithms/policy_optimization/ppo.md
    - 'Clipped Proximal Policy Optimization' : algorithms/policy_optimization/cppo.md
    - 'Direct Future Prediction' : algorithms/other/dfp.md
    - 'Behavioral Cloning' : algorithms/imitation/bc.md
- Coach Dashboard : 'dashboard.md'
- Contributing :
    - Adding a New Agent : 'contributing/add_agent.md'
    - Adding a New Environment : 'contributing/add_env.md'
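
In a `pages` list like the one above, a section entry such as `Algorithms` holds its child pages as an indented sub-list, and each child maps a title to a Markdown path relative to the docs directory. A minimal sketch of registering one more algorithm page, assuming a hypothetical file `algorithms/value_optimization/my_agent.md` (the title and path are placeholders, not part of the repository):

```yaml
pages:
    - Home : index.md
    - Algorithms:
        - 'DQN' : algorithms/value_optimization/dqn.md
        # Hypothetical entry: title and path are illustrative only.
        - 'My Agent' : algorithms/value_optimization/my_agent.md
```

The indentation is what attaches the child entries to their section. If they sit at the same level as `- Algorithms:`, they would be read as top-level pages and the section would be left empty.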