mirror of
https://github.com/gryf/coach.git
synced 2025-12-18 11:40:18 +01:00
update of api docstrings across coach and tutorials [WIP] (#91)
* updating the documentation website
* adding the built docs
* update of api docstrings across coach and tutorials 0-2
* added some missing api documentation
* New Sphinx based documentation
22
docs/_sources/features/benchmarks.rst.txt
Normal file
@@ -0,0 +1,22 @@
Benchmarks
==========

Reinforcement learning is a developing field, and so far it has been particularly difficult to reproduce some of the
results published in the original papers. Some reasons for this are:

* Reinforcement learning algorithms are notorious for their unstable learning process.
  The data the neural network trains on is dynamic and depends on the random seed defined for the environment.

* Reinforcement learning algorithms have many moving parts. For some environments and agents, many
  "tricks" are needed to reproduce the exact behavior the paper authors observed. There are also **a lot** of
  hyper-parameters to set.

For a reinforcement learning implementation to be useful for research or data science, it must be
shown to achieve the expected behavior. For this reason, we collected a set of benchmark results from most
of the algorithms implemented in Coach. The algorithms were tested on a subset of the same environments that were
used in the original papers, with multiple seeds for each environment.
Additionally, Coach uses strict testing mechanisms to help ensure that the results shown for these
benchmarks stay intact as Coach continues to develop.

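The idea of testing with multiple seeds can be illustrated with a minimal sketch. It does not use the Coach API:
``run_trial`` and ``benchmark`` are hypothetical names, and the "training run" is a toy random walk standing in
for a real agent. It only shows why a single seed is not enough and how the spread across seeds might be summarized.

```python
import random
import statistics


def run_trial(seed, num_episodes=50):
    """Hypothetical stand-in for one training run; returns a final score.

    A real benchmark would train an agent here. This toy version only
    simulates a noisy learning curve so the seed dependence is visible.
    """
    rng = random.Random(seed)  # all randomness flows from the seed
    score = 0.0
    for _ in range(num_episodes):
        score += rng.gauss(1.0, 0.5)  # noisy per-episode improvement
    return score


def benchmark(seeds):
    """Run one trial per seed and summarize the spread of final scores."""
    scores = [run_trial(seed) for seed in seeds]
    return {
        "scores": scores,
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores),  # needs at least two seeds
    }


if __name__ == "__main__":
    summary = benchmark(seeds=[0, 1, 2, 3, 4])
    print(f"mean={summary['mean']:.2f} stdev={summary['stdev']:.2f}")
```

Repeating the same seed reproduces the same score, while different seeds give different scores, so reporting the
mean together with the standard deviation gives a fairer picture than any single run.
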
To see the benchmark results, please visit the
`following GitHub page <https://github.com/NervanaSystems/coach/tree/master/benchmarks>`_.