Mirror of https://github.com/gryf/coach.git, synced 2026-02-01 21:35:45 +01:00
2.5 KiB
| Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Entropy/Mean | Entropy/Stdev | Entropy/Max | Entropy/Min | Advantages/Mean | Advantages/Stdev | Advantages/Max | Advantages/Min | Values/Mean | Values/Stdev | Values/Max | Values/Min | Value Loss/Mean | Value Loss/Stdev | Value Loss/Max | Value Loss/Min | Policy Loss/Mean | Policy Loss/Stdev | Policy Loss/Max | Policy Loss/Min | Q/Mean | Q/Stdev | Q/Max | Q/Min | TD targets/Mean | TD targets/Stdev | TD targets/Max | TD targets/Min | actions/Mean | actions/Stdev | actions/Max | actions/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0 | 1.0 | 1000.0 | 1.0 | 1000.0 | 1000.0 | 0.0 | 0.0 | |||||||||||||||||||||||||||||||||||||||||||||||||
| 2 | 0.0 | 1.0 | 2000.0 | 2.0 | 1000.0 | 2000.0 | 0.0 | 1.0 | |||||||||||||||||||||||||||||||||||||||||||||||||
| 3 | 999.0 | 0.0 | 3000.0 | 3.0 | 1000.0 | 3000.0 | -0.017666830179174003 | 0.0 | 0.0 | 1.0 | 0.0038191572866389523 | 0.0037619606969040414 | 0.026500405743718147 | 0.0005351771251298487 | 0.00010000000000000003 | 4.0657581468206416e-20 | 0.0001 | 0.0001 | 0.14963633 | 0.12557492 | 1.0296046 | 0.022291046000000002 | -0.06052997 | 0.07319117 | 0.09228404 | -0.81788486 | -0.2878782119320924 | 0.18290294876848567 | 0.15647711277008056 | -1.5949552059173584 | 0.001995146476856135 | 0.7060122989726414 | 1.2513357156871705 | -1.2238670209506697 | |||||||||||||||||||||||
| 4 | 1999.0 | 0.0 | 4000.0 | 4.0 | 1000.0 | 4000.0 | -0.039999362478752916 | 0.0 | 0.0 | 1.0 | 0.0005999824570681085 | 0.0005200396841794251 | 0.007392321713268756 | 0.0001155680656665936 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.035074446 | 0.023916507000000004 | 0.27376664 | 0.007415438000000001 | -0.055866152 | 0.03557198 | 0.048398294 | -0.20761846 | -0.1228111611005996 | 0.10064295523824936 | 0.10938632614910604 | -0.9385637390613556 | 0.3813050626909247 | 0.8586455988935526 | 1.9421025144380244 | -1.2856749345811207 | |||||||||||||||||||||||
| 5 | 2999.0 | 0.0 | 5000.0 | 5.0 | 1000.0 | 5000.0 | 0.17145601483403705 | 0.0 | 0.0 | 0.0 | 0.00016860231386453962 | 8.468335419663504e-05 | 0.0008860444650053977 | 4.115650153835304e-05 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.013316613 | 0.0064833634999999995 | 0.055022966 | 0.0038480079 | -0.04638309 | 0.028830105 | -0.0011850878 | -0.36706513 | -0.055944047987441965 | 0.061761207984293 | 0.16325088679790498 | -0.6333049428462982 | 0.2629062319462116 | 0.7653580979608734 | 1.6618100999356682 | -1.1699750612176198 |