mirror of
https://github.com/gryf/coach.git
| Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Discounted Return/Mean | Discounted Return/Stdev | Discounted Return/Max | Discounted Return/Min | Entropy/Mean | Entropy/Stdev | Entropy/Max | Entropy/Min | Advantages/Mean | Advantages/Stdev | Advantages/Max | Advantages/Min | Values/Mean | Values/Stdev | Values/Max | Values/Min | Value Loss/Mean | Value Loss/Stdev | Value Loss/Max | Value Loss/Min | Policy Loss/Mean | Policy Loss/Stdev | Policy Loss/Max | Policy Loss/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0 | 1.0 | 772.0 | 1.0 | 772.0 | 772.0 | 0.0 | | | 0.0 | | | | | | | | | | | | | | | | -2.437332009209832 | 0.5666975756966289 | -0.7105532272722921 | -3.364332223379411 | | | | | | | | | | | | | | | | | | | | |
| 2 | 0.0 | 1.0 | 821.0 | 1.0 | 821.0 | 1593.0 | 0.0 | | | 0.0 | | | | | | | | | | | | | | | | -2.3375427452853184 | 0.562882024173797 | -0.7105532272722921 | -3.3225778431943085 | | | | | | | | | | | | | | | | | | | | |
| 3 | 38.0 | 0.0 | 763.0 | 1.0 | 763.0 | 2356.0 | 0.0 | -21.0 | -21.0 | 0.0 | | | | 1.2098866000000001 | 1.449215 | 5.3609241999999995 | 0.00244356 | | | | | | | | | -2.5178046202451694 | 0.5843148195084643 | -0.7105532272722921 | -3.3699982440767453 | 1.7662544999999998 | 0.03266678 | 1.7917435000000002 | 1.6590552 | -0.09202062882188904 | 0.4331878633448028 | 0.8984384536743164 | -0.9984065890312196 | -1.7017021 | 1.594185 | -0.041688699999999995 | -4.9379349999999995 | 0.22111915 | 0.19444092 | 0.59284925 | 2.0590694e-05 | -0.1670907 | 0.6072369000000001 | 0.91119975 | -1.4746135 |
| 4 | 75.0 | 0.0 | 740.0 | 1.0 | 740.0 | 3096.0 | 0.0 | -21.0 | -21.0 | 0.0 | | | | 1.9744401999999999 | 0.8914412 | 4.1053777 | 0.54077625 | | | | | | | | | -2.533184641659896 | 0.5861942513660167 | -0.7105532272722921 | -3.3699982440767453 | 1.1498803 | 0.17261624 | 1.7622604 | 0.99270844 | 0.19542027049594451 | 0.4488243660076464 | 0.9995923042297364 | -0.9500741958618164 | -5.264807 | 0.15003455 | -4.995072 | -5.481164 | 0.22140607 | 0.09454554 | 0.4738923 | 0.1269643 | 0.24964908 | 0.42948514 | 0.77320266 | -0.62486225 |
| 5 | 113.0 | 0.0 | 755.0 | 1.0 | 755.0 | 3851.0 | 0.0 | -21.0 | -21.0 | 0.0 | | | | 1.6745409 | 0.7766086999999999 | 3.4373443 | 0.42763662 | | | | | | | | | -2.5246431129611286 | 0.5835765895797549 | -0.7105532272722921 | -3.3699982440767453 | 0.8715389000000001 | 0.11778103 | 1.3657371999999999 | 0.70607877 | 0.02502031455168853 | 0.4342484515718581 | 0.8662590980529785 | -0.9686682224273682 | -3.4500135999999997 | 0.4302874 | -3.1447627999999996 | -4.761848400000001 | 0.10419229 | 0.05407177 | 0.22493912 | 0.052366237999999996 | 0.030020599999999995 | 0.3442417 | 0.68716383 | -0.674335 |