mirror of
https://github.com/gryf/coach.git
synced 2026-01-26 01:15:45 +01:00
2.5 KiB
| 1 | Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Discounted Return/Mean | Discounted Return/Stdev | Discounted Return/Max | Discounted Return/Min | Entropy/Mean | Entropy/Stdev | Entropy/Max | Entropy/Min | Advantages/Mean | Advantages/Stdev | Advantages/Max | Advantages/Min | Values/Mean | Values/Stdev | Values/Max | Values/Min | Value Loss/Mean | Value Loss/Stdev | Value Loss/Max | Value Loss/Min | Policy Loss/Mean | Policy Loss/Stdev | Policy Loss/Max | Policy Loss/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 1 | 0.0 | 1.0 | 881.0 | 1.0 | 881.0 | 881.0 | 0.0 | 0.0 | -2.041213323423532 | 0.9183659584454216 | 0.4796594773496936 | -3.352701688899176 | |||||||||||||||||||||||||||||||||||||
| 3 | 2 | 0.0 | 1.0 | 1043.0 | 1.0 | 1043.0 | 1924.0 | 0.0 | 0.0 | -1.7995206029952229 | 0.6440801924897366 | -0.3927560490055896 | -3.2471439401326068 | |||||||||||||||||||||||||||||||||||||
| 4 | 3 | 38.0 | 0.0 | 763.0 | 1.0 | 763.0 | 2687.0 | 0.0 | -21.0 | -21.0 | 0.0 | 1.6584430000000001 | 1.5743899 | 6.016515 | 0.0040417006 | -2.5178046202451694 | 0.5843148195084643 | -0.7105532272722921 | -3.3699982440767453 | 1.7365953 | 0.060906168 | 1.789465 | 1.5938344 | -0.13552558197345782 | 0.4155380274318002 | 0.5708987712860107 | -1.000117301940918 | -1.2616656000000002 | 1.0273496 | -0.025292996 | -3.1295607000000003 | 0.09551952 | 0.105760135 | 0.37662047 | 3.6489364000000002e-06 | -0.24352022 | 0.5668899000000001 | 0.5707099 | -1.4730957 | |||||||||||
| 5 | 4 | 75.0 | 0.0 | 740.0 | 1.0 | 740.0 | 3427.0 | 0.0 | -21.0 | -21.0 | 0.0 | 3.23845 | 1.8577351999999998 | 7.4290956999999995 | 0.28597927 | -2.533184641659896 | 0.5861942513660167 | -0.7105532272722921 | -3.3699982440767453 | 1.3982204 | 0.15588406 | 1.5950133999999998 | 1.0844505 | -0.06379161493645774 | 0.4273890014930976 | 0.5925705432891846 | -0.9768838882446288 | -2.6351327999999996 | 0.30348834 | -2.275514 | -3.2641757 | 0.093365364 | 0.06693241 | 0.24693722 | 0.032394797 | -0.09340415 | 0.49059078 | 0.462054 | -1.0635808 | |||||||||||
| 6 | 5 | 113.0 | 0.0 | 755.0 | 1.0 | 755.0 | 4182.0 | 0.0 | -21.0 | -21.0 | 0.0 | 3.0537505 | 1.9605529 | 8.232577000000001 | 0.2008676 | -2.5246431129611286 | 0.5835765895797549 | -0.7105532272722921 | -3.3699982440767453 | 1.265182 | 0.080454364 | 1.3671972 | 1.1099908 | -0.08276949251020277 | 0.4270986000616472 | 0.5576775074005127 | -0.9796819686889648 | -2.353897 | 0.2969966 | -2.0010118 | -3.076977 | 0.09463200000000001 | 0.07529009 | 0.26480454 | 0.025184255 | -0.10958841400000001 | 0.47672352 | 0.5033947 | -1.1684946000000003 |
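
The leading columns of the rows above can be sanity-checked with a short script. The following is a minimal sketch, assuming the underlying file is a plain comma-separated CSV with the same header names; `SAMPLE`, `check_total_steps`, and `heatup_episodes` are hypothetical names introduced only for illustration, and the sample copies just the `Episode #`, `In Heatup`, `Episode Length`, and `Total steps` values shown in the table.

```python
import csv
import io

# Hypothetical excerpt: four of the leading columns, values copied from the table above.
SAMPLE = """Episode #,In Heatup,Episode Length,Total steps
1,1.0,881.0,881.0
2,1.0,1043.0,1924.0
3,0.0,763.0,2687.0
4,0.0,740.0,3427.0
5,0.0,755.0,4182.0
"""


def check_total_steps(csv_text):
    """Return True if 'Total steps' is the running sum of 'Episode Length'."""
    running = 0.0
    for row in csv.DictReader(io.StringIO(csv_text)):
        running += float(row["Episode Length"])
        if float(row["Total steps"]) != running:
            return False
    return True


def heatup_episodes(csv_text):
    """Return the episode numbers whose 'In Heatup' flag is 1.0."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [int(row["Episode #"]) for row in reader if float(row["In Heatup"]) == 1.0]


if __name__ == "__main__":
    print(check_total_steps(SAMPLE))
    print(heatup_episodes(SAMPLE))
```

For a full file, the same functions could be given the text of the CSV read from disk. Columns that are empty in a given row would need to be skipped before calling `float`.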