| Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Discounted Return/Mean | Discounted Return/Stdev | Discounted Return/Max | Discounted Return/Min | Entropy/Mean | Entropy/Stdev | Entropy/Max | Entropy/Min | Advantages/Mean | Advantages/Stdev | Advantages/Max | Advantages/Min | Values/Mean | Values/Stdev | Values/Max | Values/Min | Value Loss/Mean | Value Loss/Stdev | Value Loss/Max | Value Loss/Min | Policy Loss/Mean | Policy Loss/Stdev | Policy Loss/Max | Policy Loss/Min | Q/Mean | Q/Stdev | Q/Max | Q/Min | TD targets/Mean | TD targets/Stdev | TD targets/Max | TD targets/Min | actions/Mean | actions/Stdev | actions/Max | actions/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0 | 1.0 | 97.0 | 1.0 | 25.0 | 25.0 | 0.0 | 0.0 | -480.6903328824848 | 254.92315277284558 | -40.0 | -888.7145624034126 | |||||||||||||||||||||||||||||||||||||||||||||||||
| 2 | 0.0 | 1.0 | 194.0 | 2.0 | 25.0 | 50.0 | 0.0 | 0.0 | -480.6903328824848 | 254.92315277284558 | -40.0 | -888.7145624034126 | |||||||||||||||||||||||||||||||||||||||||||||||||
| 3 | 0.0 | 0.0 | 291.0 | 3.0 | 25.0 | 75.0 | -0.013705192291281485 | -1000.0 | -1000.0 | 0.0 | -480.6903328824848 | 254.92315277284558 | -40.0 | -888.7145624034126 | -0.04823625 | 0.023806227000000003 | 0.021652615 | -0.10660829 | -0.12859384303253607 | 0.18325901915279744 | 0.04875503852963448 | -0.7279058694839478 | |||||||||||||||||||||||||||||||||||||||
| 4 | 0.0 | 0.0 | 388.0 | 4.0 | 25.0 | 100.0 | -0.02430443169727376 | -1000.0 | -1000.0 | 0.0 | -480.6903328824848 | 254.92315277284558 | -40.0 | -888.7145624034126 | -0.04093397 | 0.045165304 | 0.09869731 | -0.120250694 | -0.16543949867939364 | 0.2667459507740165 | 0.1111396551132202 | -1.4467874369949862 | |||||||||||||||||||||||||||||||||||||||
| 5 | 0.0 | 0.0 | 485.0 | 5.0 | 25.0 | 125.0 | 0.0 | -1000.0 | -1000.0 | 0.0 | -480.6903328824848 | 254.92315277284558 | -40.0 | -888.7145624034126 | -0.050059527 | 0.0325627 | 0.057268333 | -0.10848339 | -0.10842234288811206 | 0.1848078590593072 | 0.10341470901523106 | -0.6624982986866047 |