mirror of
https://github.com/gryf/coach.git
synced 2026-02-16 05:55:46 +01:00
3.2 KiB
| Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Discounted Return/Mean | Discounted Return/Stdev | Discounted Return/Max | Discounted Return/Min | Q/Mean | Q/Stdev | Q/Max | Q/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0 | 1.0 | 478.0 | 478.0 | 478.0 | 478.0 | 1.0 | | | 0.0 | | | | | | | | | | | | | | | | 0.8983135579407482 | 0.9883070480354864 | 4.998577182999947 | 0.0 | | | | |
| 2 | 0.0 | 1.0 | 956.0 | 956.0 | 478.0 | 956.0 | 1.0 | | | 0.0 | | | | | | | | | | | | | | | | 0.5556227853017673 | 0.8997466572364776 | 3.6930440629040366 | 0.0 | | | | |
| 3 | 0.0 | 1.0 | 1434.0 | 1434.0 | 478.0 | 1434.0 | 1.0 | | | 0.0 | | | | | | | | | | | | | | | | 0.4880512587305408 | 1.2184315272991455 | 7.981622766961119 | 0.0 | | | | |
| 4 | 120.0 | 0.0 | 1912.0 | 1912.0 | 478.0 | 1912.0 | 0.9995698000000142 | 9.0 | 9.0 | 0.0 | | | | 0.009689851886287215 | 0.012757843524910169 | 0.07483358681201935 | 8.608748612459749e-05 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.062028848 | 0.038314 | 0.20554277 | 0.0071169026 | 0.5300200921798093 | 0.9662463591142564 | 6.5978001830226605 | 0.0 | | | | |
| 5 | 239.0 | 0.0 | 2390.0 | 2390.0 | 478.0 | 2390.0 | 0.9991396000000284 | 9.0 | 9.0 | 0.0 | | | | 0.00922453978664864 | 0.01264670765167686 | 0.05902014672756195 | 0.0001334039407083765 | 0.0001 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.067527555 | 0.042208317999999995 | 0.24960242 | 0.011927829 | 0.8217024339343054 | 0.9620331221605151 | 5.227525088445949 | 0.0 | | | | |
| 6 | 359.0 | 0.0 | 2868.0 | 2868.0 | 478.0 | 2868.0 | 0.9987094000000424 | 9.0 | 9.0 | 0.0 | | | | 0.009785913288457477 | 0.01135032046581406 | 0.04819483682513237 | 0.0002595906553324312 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.07394858 | 0.041478443999999996 | 0.19334313 | 0.01290816 | 1.016981481614472 | 1.230148514321286 | 5.0096655797309415 | 0.0 | | | | |
| 7 | 478.0 | 0.0 | 3346.0 | 3346.0 | 478.0 | 3346.0 | 0.9982792000000568 | 12.0 | 12.0 | 0.0 | | | | 0.006171494484522005 | 0.008621506455658536 | 0.04043177887797356 | 0.00014119730622041968 | 0.0001 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.057615485 | 0.039739773 | 0.21918385 | 0.007797438 | 1.4498607827770469 | 1.761905543060363 | 5.277624938631289 | 0.0 | | | | |
| 8 | 598.0 | 0.0 | 3824.0 | 3824.0 | 478.0 | 3824.0 | 0.9978490000000708 | 3.0 | 3.0 | 0.0 | | | | 0.0093270375176265 | 0.010865469313652756 | 0.044152088463306434 | 0.00019287978648208082 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.07268587 | 0.044536818 | 0.21599004 | 0.011831168000000001 | 0.2135882596755921 | 0.36137271310460906 | 2.1020603780226303 | 0.0 | | | | |
| 9 | 717.0 | 0.0 | 4302.0 | 4302.0 | 478.0 | 4302.0 | 0.997418800000085 | 8.0 | 8.0 | 0.0 | | | | 0.007293934819504752 | 0.008668179302989909 | 0.042650774121284485 | 0.00018305983394384384 | 0.0001 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.0654777 | 0.039301243 | 0.20190288 | 0.011749206000000002 | 0.5106683851193774 | 1.121313650225975 | 5.72916711150828 | 0.0 | 0.038647518 | 0.018493647 | 0.085213184 | -0.00357599 |
| 10 | 837.0 | 0.0 | 4780.0 | 4780.0 | 478.0 | 4780.0 | 0.9969886000000991 | 5.0 | 5.0 | 0.0 | | | | 0.006626958637813611 | 0.007690525022001264 | 0.042473964393138885 | 0.00013746884360443798 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.061515365 | 0.03587848 | 0.1821926 | 0.012653646999999999 | 0.5892099563483264 | 0.4988523336071251 | 2.2961085753574486 | 0.0 |