Mirror of https://github.com/gryf/coach.git (synced 2026-02-14 04:45:50 +01:00)
2.0 KiB
| Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Entropy/Mean | Entropy/Stdev | Entropy/Max | Entropy/Min | Advantages/Mean | Advantages/Stdev | Advantages/Max | Advantages/Min | Values/Mean | Values/Stdev | Values/Max | Values/Min | Value Loss/Mean | Value Loss/Stdev | Value Loss/Max | Value Loss/Min | Policy Loss/Mean | Policy Loss/Stdev | Policy Loss/Max | Policy Loss/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0 | 1.0 | 772.0 | 1.0 | 772.0 | 772.0 | 0.0 | 0.0 | |||||||||||||||||||||||||||||||||||||
| 2 | 0.0 | 1.0 | 821.0 | 1.0 | 821.0 | 1593.0 | 0.0 | 0.0 | |||||||||||||||||||||||||||||||||||||
| 3 | 40.0 | 0.0 | 798.0 | 1.0 | 798.0 | 2391.0 | 0.0 | -21.0 | -21.0 | 0.0 | 1.6925433 | 1.56166 | 5.4782605 | 0.011157283 | 1.7178073 | 0.09590818 | 1.7916279 | 1.4786046 | -1.1788440822140267 | 1.255206542380211 | 1.0642428398132324 | -3.2777271270751958 | -1.3811463 | 1.9222080000000001 | -0.023936661 | -5.8730206 | 0.34145027 | 0.439121 | 2.7880993 | 0.00036954717000000004 | -0.13185489 | 0.9237273000000001 | 0.9732863000000002 | -4.1131425 | |||||||||||
| 4 | 78.0 | 0.0 | 755.0 | 1.0 | 755.0 | 3146.0 | 0.0 | -21.0 | -21.0 | 0.0 | 2.4706511000000004 | 4.6726856 | 30.583601 | 0.67071736 | 1.5111907 | 0.024124695 | 1.5492108999999998 | 1.439797 | 0.24763718664174345 | 0.8147690823204996 | 4.918219566345215 | -0.9567646980285645 | -4.656093599999999 | 0.40140057 | -2.9237971000000003 | -5.8753314 | 0.443355 | 1.7042981 | 10.802688999999999 | 0.08576387 | 0.41162340000000003 | 1.3014139999999998 | 7.6323996 | -0.8151264 | |||||||||||
| 5 | 116.0 | 0.0 | 756.0 | 1.0 | 756.0 | 3902.0 | 0.0 | -21.0 | -21.0 | 0.0 | 1.850646 | 2.4202513999999997 | 16.295288 | 0.4290005 | 1.3224964 | 0.050945385999999995 | 1.4303273 | 1.2031558 | 0.14155570039685988 | 0.5814409942921325 | 3.3915529251098637 | -0.8861622810363771 | -3.0170727000000004 | 0.6280401999999999 | -2.1003609 | -4.3324237 | 0.22347862 | 0.839418 | 5.326185 | 0.05343006 | 0.22746566 | 0.82214487 | 4.685924 | -0.7187973000000001 |