Mirror of https://github.com/gryf/coach.git, synced 2026-01-07 14:24:16 +01:00
2.5 KiB
| Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Discounted Return/Mean | Discounted Return/Stdev | Discounted Return/Max | Discounted Return/Min | Entropy/Mean | Entropy/Stdev | Entropy/Max | Entropy/Min | Advantages/Mean | Advantages/Stdev | Advantages/Max | Advantages/Min | Values/Mean | Values/Stdev | Values/Max | Values/Min | Value Loss/Mean | Value Loss/Stdev | Value Loss/Max | Value Loss/Min | Policy Loss/Mean | Policy Loss/Stdev | Policy Loss/Max | Policy Loss/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0 | 1.0 | 772.0 | 1.0 | 772.0 | 772.0 | 0.0 | 0.0 | -2.437332009209832 | 0.5666975756966289 | -0.7105532272722921 | -3.364332223379411 | |||||||||||||||||||||||||||||||||||||
| 2 | 0.0 | 1.0 | 821.0 | 1.0 | 821.0 | 1593.0 | 0.0 | 0.0 | -2.3375427452853184 | 0.562882024173797 | -0.7105532272722921 | -3.3225778431943085 | |||||||||||||||||||||||||||||||||||||
| 3 | 38.0 | 0.0 | 763.0 | 1.0 | 763.0 | 2356.0 | 0.0 | -21.0 | -21.0 | 0.0 | 0.9789834999999999 | 1.335424 | 5.44189 | 0.0016743626999999998 | -2.5178046202451694 | 0.5843148195084643 | -0.7105532272722921 | -3.3699982440767453 | 1.7676028999999998 | 0.03572973 | 1.791706 | 1.671823 | -0.15197662177900048 | 0.42457358040248205 | 0.7191236019134521 | -1.0115903615951538 | -1.094308 | 1.2064375 | 0.07422362 | -3.9562736000000003 | 0.18559802 | 0.18026318 | 0.5619888000000001 | 6.661260000000001e-06 | -0.27577358 | 0.58109426 | 0.6487745 | -1.4974867 | |||||||||||
| 4 | 75.0 | 0.0 | 740.0 | 1.0 | 740.0 | 3096.0 | 0.0 | -21.0 | -21.0 | 0.0 | 2.0474174 | 1.1940155000000001 | 5.4071739999999995 | 0.46045935 | -2.533184641659896 | 0.5861942513660167 | -0.7105532272722921 | -3.3699982440767453 | 1.4143568000000002 | 0.19615947 | 1.7779577000000002 | 1.16588 | 0.22533110479513804 | 0.4591070601365748 | 1.1359987258911133 | -0.9571647644042968 | -5.57011 | 0.6552074999999999 | -4.209009 | -6.2400413 | 0.2781368 | 0.12522699 | 0.57409745 | 0.15991631 | 0.31805104 | 0.5110617999999999 | 1.0385043999999999 | -0.8121315 | |||||||||||
| 5 | 113.0 | 0.0 | 755.0 | 1.0 | 755.0 | 3851.0 | 0.0 | -21.0 | -21.0 | 0.0 | 1.5748503999999999 | 0.6886424 | 2.8886950000000002 | 0.46616986 | -2.5246431129611286 | 0.5835765895797549 | -0.7105532272722921 | -3.3699982440767453 | 1.2100506000000002 | 0.029187731 | 1.4943258 | 1.1479353 | 0.08319399034654773 | 0.4409353372895212 | 1.0099096298217771 | -0.9640896320343018 | -4.039905 | 0.5391477 | -3.5903852 | -5.5438123 | 0.113760486 | 0.047593262000000004 | 0.21574955 | 0.062469393 | 0.0946553 | 0.4438318 | 0.6956541 | -0.781316 |