mirror of https://github.com/gryf/coach.git
synced 2026-01-29 11:35:51 +01:00
| Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Discounted Return/Mean | Discounted Return/Stdev | Discounted Return/Max | Discounted Return/Min | Entropy/Mean | Entropy/Stdev | Entropy/Max | Entropy/Min | Advantages/Mean | Advantages/Stdev | Advantages/Max | Advantages/Min | Values/Mean | Values/Stdev | Values/Max | Values/Min | Value Loss/Mean | Value Loss/Stdev | Value Loss/Max | Value Loss/Min | Policy Loss/Mean | Policy Loss/Stdev | Policy Loss/Max | Policy Loss/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0 | 1.0 | 478.0 | 1.0 | 478.0 | 478.0 | 0.05 | 0.0 | 0.6772166738139332 | 1.3606583998522768 | 7.111435392915562 | 0.0 | |||||||||||||||||||||||||||||||||||||
| 2 | 0.0 | 1.0 | 478.0 | 1.0 | 478.0 | 956.0 | 0.05 | 0.0 | 0.3461585865957836 | 0.7129541964353258 | 3.4825898934060247 | 0.0 | |||||||||||||||||||||||||||||||||||||
| 3 | 0.0 | 1.0 | 478.0 | 1.0 | 478.0 | 1434.0 | 0.05 | 0.0 | 0.8221885517216162 | 1.0710747025505476 | 7.5093869236316815 | 0.0 | |||||||||||||||||||||||||||||||||||||
| 4 | 23.0 | 0.0 | 478.0 | 1.0 | 478.0 | 1912.0 | 0.05 | 3.0 | 3.0 | 0.0 | 6.035387999999999 | 19.725208 | 82.50349 | 0.039248765 | 0.12383280129798925 | 0.5223793934675833 | 2.79990211919977 | 0.0 | 0.0452739237450773 | 0.2183428467460978 | 1.8627784317196168 | -0.0030643000370221317 | -0.004365872 | 0.0017351342000000002 | -0.0009706963 | -0.013218256000000001 | 0.03401077 | 0.11164869 | 0.451847 | 4.2722320000000005e-07 | 1.0746696000000002 | 3.4710016 | 13.569623000000002 | -0.042516552 | |||||||||||||||
| 5 | 47.0 | 0.0 | 478.0 | 1.0 | 478.0 | 2390.0 | 0.05 | 4.0 | 4.0 | 0.0 | 4.0586433 | 13.706188000000001 | 61.88896 | 0.01625376 | 0.1100642112599555 | 0.4374646870965546 | 3.5802294655332134 | 0.0 | 0.06605603117614679 | 0.2828204733180097 | 2.495711163345754 | -0.003739734889965058 | -0.014320336000000001 | 0.001017332 | -0.011700756999999999 | -0.019422526000000002 | 0.06243892400000001 | 0.22280176 | 1.0344708 | 5.7007526e-07 | 1.0248749 | 3.348449 | 14.245454999999998 | 0.0033072747999999996 | |||||||||||||||
| 6 | 71.0 | 0.0 | 478.0 | 1.0 | 478.0 | 2868.0 | 0.05 | 5.0 | 5.0 | 0.0 | 5.894111 | 25.05461 | 122.94565 | 0.023589091 | 0.09171214216631716 | 0.4844599433567191 | 4.654235241755948 | 0.0 | 0.049990671231890695 | 0.3035343549912095 | 3.2950532226726725 | -0.002476382634522309 | -0.013464768 | 0.0015044829999999998 | -0.0038845115000000004 | -0.017055605 | 0.06480691 | 0.29358354 | 1.4410233000000001 | 3.734214e-07 | 0.8669504000000001 | 3.7694442 | 18.52095 | -0.0042024343 | |||||||||||||||
| 7 | 95.0 | 0.0 | 478.0 | 1.0 | 478.0 | 3346.0 | 0.05 | 5.0 | 5.0 | 0.0 | 7.687636 | 20.715212 | 84.180954 | 0.028861007 | 0.3429683427703999 | 0.5788530473857294 | 3.4984647932875417 | 0.0 | 0.08723572006555756 | 0.3080234455301096 | 2.525255932768265 | -0.0040773123184797465 | 0.0049094097 | 0.0043921572999999995 | 0.011444922 | -0.007210325 | 0.07753111 | 0.24353394 | 1.1304462 | 3.6353214e-07 | 2.1490667 | 5.9981136 | 24.37817 | -0.043448977 | |||||||||||||||
| 8 | 119.0 | 0.0 | 478.0 | 1.0 | 478.0 | 3824.0 | 0.05 | 2.0 | 2.0 | 0.0 | 1.5874879 | 6.5180235 | 32.02797 | 0.023313208 | 0.04601948363571996 | 0.22472693959922946 | 1.8345137614500877 | 0.0 | 0.008022510747359511 | 0.08955419064160207 | 1.0005814651570817 | -0.003758238096650968 | 0.0059175556999999995 | 0.0011794145 | 0.008881162 | 0.0026903595 | 0.0042831437 | 0.015627237 | 0.073625945 | 4.8217976e-07 | 0.14996052 | 0.7111329000000001 | 3.4817169000000003 | -0.056617767 | |||||||||||||||
| 9 | 143.0 | 0.0 | 478.0 | 1.0 | 478.0 | 4302.0 | 0.05 | 2.0 | 2.0 | 0.0 | 2.9201734 | 9.209149 | 34.853527 | 0.024623917000000002 | 0.09876412763598223 | 0.3162928485152334 | 1.6894490858690778 | 0.0 | 0.030283883503414787 | 0.15808164771534552 | 1.0039386003974171 | -0.004561017796980071 | 0.0048816567 | 0.0016686192000000002 | 0.0079843905 | -0.0026530582 | 0.017186822 | 0.056464136 | 0.22929375 | 2.3640719e-07 | 0.52893436 | 1.7419220000000002 | 6.477185700000001 | -0.04866889 | |||||||||||||||
| 10 | 167.0 | 0.0 | 478.0 | 1.0 | 478.0 | 4780.0 | 0.05 | 0.0 | 0.0 | 0.0 | 0.07378042 | 0.035627235 | 0.1401871 | 0.011221955 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0006671480381455103 | 0.0011843871302212494 | 0.0037133864868810437 | -0.003517640307545661 | -0.008119860999999999 | 0.0009026574000000001 | -0.005373695 | -0.010452669 | 1.3184773000000002e-06 | 1.1577918e-06 | 4.8367815e-06 | 3.1812826e-07 | 0.012648877 | 0.012855935 | 0.034687527 | -0.009518709 |