mirror of
https://github.com/gryf/coach.git
synced 2026-01-05 21:34:18 +01:00
2.2 KiB
| Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Discounted Return/Mean | Discounted Return/Stdev | Discounted Return/Max | Discounted Return/Min | Q/Mean | Q/Stdev | Q/Max | Q/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0 | 1.0 | 1117.0 | 1117.0 | 1117.0 | 1117.0 | 1.0 | | | 0.0 | | | | | | | | | | | | | | | | -1.5180229894995567 | 0.6998808293377133 | -0.08930329112720292 | -3.148474706421977 | | | | |
| 2 | 205.0 | 0.0 | 1937.0 | 1937.0 | 820.0 | 1937.0 | 0.9992620000000244 | -21.0 | -21.0 | 0.0 | | | | 0.011032734143000421 | 0.013050631943252157 | 0.06188610941171646 | 5.788924681837671e-05 | 0.00010000000000000002 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.09481886 | 0.06547235 | 0.4128104 | 0.014912024 | -2.3361342922088504 | 0.784322378590693 | -0.38878391807422696 | -3.369599601005491 | | | | |
| 3 | 413.0 | 0.0 | 2768.0 | 2768.0 | 831.0 | 2768.0 | 0.9985141000000488 | -21.0 | -21.0 | 0.0 | | | | 0.011714245055950064 | 0.013615523545387836 | 0.08608928322792052 | 4.0108770917868235e-05 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.078761965 | 0.042246383 | 0.2695421 | 0.010716428 | -2.320394201181889 | 0.6047235028955231 | -0.7105532272722921 | -3.350537576335216 | 0.13584568 | 0.011556746000000001 | 0.14892316 | 0.11885095400000001 |
| 4 | 667.0 | 0.0 | 3783.0 | 3783.0 | 1015.0 | 3783.0 | 0.9976006000000791 | -20.0 | -20.0 | 0.0 | | | | 0.011432893811852264 | 0.012069273288368423 | 0.050671517848968506 | 6.866654439363629e-05 | 0.00010000000000000002 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.06967258 | 0.03742053 | 0.20260468 | 0.011122073 | -1.7531357837449677 | 0.7448577440634202 | -0.1288331810939122 | -3.2971074888190803 | 0.12361515 | 0.012860634 | 0.14413264 | 0.10136055 |
| 5 | 947.0 | 0.0 | 4906.0 | 4906.0 | 1123.0 | 4906.0 | 0.9965899000001124 | -18.0 | -18.0 | 0.0 | | | | 0.010396778351917291 | 0.011949964558787713 | 0.06747313588857651 | 4.6792134526185685e-05 | 0.00010000000000000002 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.062268693 | 0.038279983999999996 | 0.35579062 | 0.008977733 | -1.5284306594746384 | 1.0386188434297503 | 0.5895609917959839 | -3.3469363938389924 | 0.10764929999999999 | 0.011775569 | 0.123398446 | 0.085828155 |