Mirror of https://github.com/gryf/coach.git
synced 2026-01-04 12:54:17 +01:00
3.1 KiB
| 1 | Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Discounted Return/Mean | Discounted Return/Stdev | Discounted Return/Max | Discounted Return/Min | Entropy/Mean | Entropy/Stdev | Entropy/Max | Entropy/Min | Advantages/Mean | Advantages/Stdev | Advantages/Max | Advantages/Min | Values/Mean | Values/Stdev | Values/Max | Values/Min | Value Loss/Mean | Value Loss/Stdev | Value Loss/Max | Value Loss/Min | Policy Loss/Mean | Policy Loss/Stdev | Policy Loss/Max | Policy Loss/Min | Q/Mean | Q/Stdev | Q/Max | Q/Min | TD targets/Mean | TD targets/Stdev | TD targets/Max | TD targets/Min | actions/Mean | actions/Stdev | actions/Max | actions/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 1 | 0.0 | 1.0 | 1001.0 | 1.0 | 1001.0 | 1001.0 | 0.0 | 0.0 | 0.1810549437584988 | 0.08342612458204374 | 0.3657155727590055 | 0.012114535848885052 | |||||||||||||||||||||||||||||||||||||||||||||||||
| 3 | 2 | 0.0 | 1.0 | 2002.0 | 2.0 | 1001.0 | 2002.0 | 0.0 | 1.0 | 0.10514369548395547 | 0.05043065738920054 | 0.21524430347618226 | 0.0011643643789458708 | |||||||||||||||||||||||||||||||||||||||||||||||||
| 4 | 3 | 1000.0 | 0.0 | 3003.0 | 3.0 | 1001.0 | 3003.0 | -0.1185302492771778 | 7.715022587137692 | 77.15022587137695 | 1.0 | 2.392776927604245e-05 | 0.0001238392879134552 | 0.003135734703391791 | 1.3632770787808113e-06 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.0050078793 | 0.0064642574 | 0.098884284 | 0.0005620557 | 0.7529912365203498 | 0.4617358686541096 | 1.594227125555209 | 0.00353194497002051 | 0.050345387000000005 | 0.05729694 | 0.19809167 | -0.08215243 | 0.05447568218516847 | 0.04079396823019403 | 0.1626849260711809 | -0.023734710598228542 | -0.34397406029780203 | 0.5815794471343353 | 1.0123427704556696 | -1.4948724317067326 | |||||||||||||||||||||||
| 5 | 4 | 2001.0 | 0.0 | 4004.0 | 4.0 | 1001.0 | 4004.0 | -0.2048510260598676 | 11.15149430448684 | 111.51494304486856 | 1.0 | 4.744879189274797e-05 | 0.00012290942505022024 | 0.0018345331773161886 | 1.965395085790078e-06 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.011079958999999999 | 0.012011923 | 0.13484268 | 0.0006308630000000001 | 1.0427006689416267 | 0.4698636052853145 | 2.4480673370988217 | 0.02353826061555796 | 0.08695571 | 0.039015066 | 0.20370862 | -0.04777467 | 0.07752333968630509 | 0.04623246871067645 | 0.2261723560740124 | -0.0225689002695848 | -0.8518067738026681 | 0.6092545155137064 | 1.0646579642019018 | -1.5449345365759264 | |||||||||||||||||||||||
| 6 | 5 | 3002.0 | 0.0 | 5005.0 | 5.0 | 1001.0 | 5005.0 | -0.02134772535498328 | 13.93309848305012 | 139.33098483050114 | 0.0 | 4.3079992066850536e-05 | 6.204749497676766e-05 | 0.0006331046461127697 | 2.68419535132125e-06 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.012529638000000001 | 0.010803898999999999 | 0.07596945 | 0.0010158704 | 1.2820087653786647 | 0.3980341965211882 | 2.1725203141180294 | 6.57125185393007e-05 | 0.42543635 | 0.20454627 | 0.8917187 | 0.05104744 | 0.08902145835758844 | 0.04616376194044128 | 0.2426698236415496 | -0.00472626581328319 | -0.5524111559396501 | 0.6382799035600116 | 1.0798330140144352 | -1.2145754834427926 |