mirror of https://github.com/gryf/coach.git, synced 2025-12-17 19:20:19 +01:00
updated the paper links in the docs and restyled the theme
committed by Gal Leibovich
parent 8c708820a9
commit 00fca9b6e0
@@ -1,6 +1,8 @@
-> Actions space: Discrete
+# Direct Future Prediction
 
-[Paper](https://arxiv.org/abs/1611.01779)
+**Action space:** Discrete
+
+**References:** [Learning to Act by Predicting the Future](https://arxiv.org/abs/1611.01779)
 
 ## Network Structure
 
@@ -1,6 +1,8 @@
-> Action space: Discrete|Continuous
+# Actor-Critic
 
-[Paper](https://arxiv.org/abs/1602.01783)
+**Action space:** Discrete|Continuous
+
+**References:** [Asynchronous Methods for Deep Reinforcement Learning](https://arxiv.org/abs/1602.01783)
 
 ## Network Structure
 <p style="text-align: center;">
@@ -1,6 +1,8 @@
-> Action Space: Discrete|Continuous
+# Clipped Proximal Policy Optimization
 
-[Paper](https://arxiv.org/pdf/1707.06347.pdf)
+**Action space:** Discrete|Continuous
+
+**References:** [Proximal Policy Optimization Algorithms](https://arxiv.org/pdf/1707.06347.pdf)
 
 ## Network Structure
 
@@ -1,6 +1,8 @@
-> Actions space: Continuous
+# Deep Deterministic Policy Gradient
 
-[Paper](https://arxiv.org/abs/1509.02971)
+**Action space:** Continuous
+
+**References:** [Continuous control with deep reinforcement learning](https://arxiv.org/abs/1509.02971)
 
 ## Network Structure
 
@@ -1,6 +1,8 @@
-> Action Space: Discrete|Continuous
+# Policy Gradient
 
-[Paper](http://www-anw.cs.umass.edu/~barto/courses/cs687/williams92simple.pdf)
+**Action space:** Discrete|Continuous
+
+**References:** [Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning](http://www-anw.cs.umass.edu/~barto/courses/cs687/williams92simple.pdf)
 
 ## Network Structure
 
@@ -1,6 +1,8 @@
-> Actions space: Discrete|Continuous
+# Proximal Policy Optimization
 
-[Paper](https://arxiv.org/pdf/1707.02286.pdf)
+**Action space:** Discrete|Continuous
+
+**References:** [Emergence of Locomotion Behaviours in Rich Environments](https://arxiv.org/pdf/1707.02286.pdf)
 
 ## Network Structure
 
@@ -1,6 +1,8 @@
-> Action space: Discrete
+# Bootstrapped DQN
 
-[Paper](https://arxiv.org/abs/1602.04621)
+**Action space:** Discrete
+
+**References:** [Deep Exploration via Bootstrapped DQN](https://arxiv.org/abs/1602.04621)
 
 ## Network Structure
 
@@ -1,6 +1,8 @@
-> Action space: Discrete
+# Distributional DQN
 
-[Paper](https://arxiv.org/abs/1707.06887)
+**Action space:** Discrete
+
+**References:** [A Distributional Perspective on Reinforcement Learning](https://arxiv.org/abs/1707.06887)
 
 ## Network Structure
 
@@ -1,8 +1,8 @@
 # Double DQN
-> Action space: Discrete
 
-[Paper](https://arxiv.org/pdf/1509.06461.pdf)
+**Action space:** Discrete
 
+**References:** [Deep Reinforcement Learning with Double Q-learning](https://arxiv.org/abs/1509.06461)
 
 ## Network Structure
 
@@ -1,6 +1,8 @@
-> Action space: Discrete
+# Deep Q Networks
 
-[Paper](https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf)
+**Action space:** Discrete
+
+**References:** [Playing Atari with Deep Reinforcement Learning](https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf)
 
 ## Network Structure
 
@@ -1,6 +1,8 @@
-> Action space: Discrete
+# Dueling DQN
 
-[Paper](https://arxiv.org/abs/1511.06581)
+**Action space:** Discrete
+
+**References:** [Dueling Network Architectures for Deep Reinforcement Learning](https://arxiv.org/abs/1511.06581)
 
 ## Network Structure
 
@@ -1,8 +1,8 @@
 # Mixed Monte Carlo
 
-> Action space: Discrete
+**Action space:** Discrete
 
-[Paper](https://arxiv.org/abs/1703.01310)
+**References:** [Count-Based Exploration with Neural Density Models](https://arxiv.org/abs/1703.01310)
 
 ## Network Structure
 
@@ -1,6 +1,8 @@
-> Action space: Discrete
+# N-Step Q Learning
 
-[Paper](https://arxiv.org/abs/1602.01783)
+**Action space:** Discrete
+
+**References:** [Asynchronous Methods for Deep Reinforcement Learning](https://arxiv.org/abs/1602.01783)
 
 ## Network Structure
 
@@ -1,6 +1,8 @@
-> Action space: Continuous
+# Normalized Advantage Functions
 
-[Paper](https://arxiv.org/abs/1603.00748.pdf)
+**Action space:** Continuous
+
+**References:** [Continuous Deep Q-Learning with Model-based Acceleration](https://arxiv.org/abs/1603.00748)
 
 ## Network Structure
 
@@ -1,6 +1,8 @@
-> Action space: Discrete
+# Neural Episodic Control
 
-[Paper](https://arxiv.org/abs/1703.01988)
+**Action space:** Discrete
+
+**References:** [Neural Episodic Control](https://arxiv.org/abs/1703.01988)
 
 ## Network Structure
 
@@ -1,6 +1,8 @@
-> Action space: Discrete
+# Persistent Advantage Learning
 
-[Paper](https://arxiv.org/abs/1512.04860)
+**Action space:** Discrete
+
+**References:** [Increasing the Action Gap: New Operators for Reinforcement Learning](https://arxiv.org/abs/1512.04860)
 
 ## Network Structure
 
docs/docs/extra.css (new file, 3 lines)
@@ -0,0 +1,3 @@
+.wy-side-nav-search {
+    background-color: #79a7a5;
+}
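The `wy-` prefix belongs to the Wyrm CSS framework that the ReadTheDocs theme is built on, and `.wy-side-nav-search` is the theme's sidebar search/title block, so this three-line stylesheet appears to be what gives the restyled docs their teal header.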
@@ -6,6 +6,7 @@ markdown_extensions:
         enable_dollar_delimiter: True #for use of inline $..$
 
 extra_javascript: ['https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_HTML']
+extra_css: [extra.css]
 
 pages:
 - Home : index.md
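For context, a minimal sketch of how these mkdocs.yml settings presumably fit together after the commit; the `mdx_math` nesting is an assumption (only the lines appearing in the hunk above are confirmed by the diff):

```yaml
# Sketch of the relevant mkdocs.yml fragment after this commit.
# ASSUMPTION: enable_dollar_delimiter is an option of the mdx_math
# (python-markdown-math) extension; only the lines shown in the hunk
# above are confirmed.
markdown_extensions:
    - mdx_math:
        enable_dollar_delimiter: True # for use of inline $..$

# MathJax renders the inline $..$ math; extra.css recolors the theme.
extra_javascript: ['https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_HTML']
extra_css: [extra.css]

pages:
- Home : index.md
```

Paths in `extra_css` are resolved relative to the docs directory, which is why the new `docs/docs/extra.css` is listed simply as `extra.css`.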