mirror of https://github.com/gryf/coach.git synced 2026-03-12 20:45:55 +01:00

Docs changes - fixing blogpost links, removing importing all exploration policies (#139)

* updated docs

* removing imports for all exploration policies in __init__ + setting the right blog-post link

* small cleanups
Gal Leibovich
2018-12-05 23:16:16 +02:00
committed by Scott Leishman
parent 155b78b995
commit f12857a8c7
33 changed files with 191 additions and 160 deletions

@@ -222,7 +222,7 @@
<span class="k">return</span> <span class="s1">&#39;rl_coach.exploration_policies.ucb:UCB&#39;</span>
-<div class="viewcode-block" id="UCB"><a class="viewcode-back" href="../../../components/exploration_policies/index.html#rl_coach.exploration_policies.UCB">[docs]</a><span class="k">class</span> <span class="nc">UCB</span><span class="p">(</span><span class="n">EGreedy</span><span class="p">):</span>
+<div class="viewcode-block" id="UCB"><a class="viewcode-back" href="../../../components/exploration_policies/index.html#rl_coach.exploration_policies.ucb.UCB">[docs]</a><span class="k">class</span> <span class="nc">UCB</span><span class="p">(</span><span class="n">EGreedy</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;</span>
<span class="sd"> UCB exploration policy is following the upper confidence bound heuristic to sample actions in discrete action spaces.</span>
<span class="sd"> It assumes that there are multiple network heads that are predicting action values, and that the standard deviation</span>