mirror of
https://github.com/gryf/coach.git
synced 2026-01-31 13:05:55 +01:00
1.4 KiB
| 1 | Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Q/Mean | Q/Stdev | Q/Max | Q/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 1 | 0.0 | 1.0 | 986.0 | 986.0 | 986.0 | 986.0 | 7.0 | 0.0 | |||||||||||||||||||||
| 3 | 2 | 0.0 | 1.0 | 1806.0 | 1806.0 | 820.0 | 1806.0 | 4.0 | 0.0 | |||||||||||||||||||||
| 4 | 3 | 207.0 | 0.0 | 2634.0 | 2634.0 | 828.0 | 2634.0 | 1.0 | -21.0 | -21.0 | 0.0 | | | | 0.013430694482291505 | 0.012774117514024573 | 0.06467919796705246 | 0.0005054873763583599 | 0.0002500000000000001 | 1.0842021724855042e-19 | 0.00025 | 0.00025 | 0.013462509 | 0.005010004 | 0.032169305 | 0.0046610474 | | | | |
| 5 | 4 | 433.0 | 0.0 | 3538.0 | 3538.0 | 904.0 | 3538.0 | 1.0 | -21.0 | -21.0 | 0.0 | | | | 0.013214294455993912 | 0.012243776759493771 | 0.048550304025411606 | 0.00030727600096724933 | 0.0002500000000000001 | 1.0842021724855042e-19 | 0.00025 | 0.00025 | 0.012283348500000001 | 0.004644497 | 0.032848116000000004 | 0.0047284905 | | | | |
| 6 | 5 | 664.0 | 0.0 | 4462.0 | 4462.0 | 924.0 | 4462.0 | 2.0 | -20.0 | -20.0 | 0.0 | | | | 0.013385360111538885 | 0.013904787720461907 | 0.06079941987991332 | 0.0005098563269712031 | 0.0002500000000000001 | 1.0842021724855042e-19 | 0.00025 | 0.00025 | 0.010943641 | 0.0043348954 | 0.03260831 | 0.0045090048 | 0.00066530565 | 0.0129122045 | 0.024260167000000003 | -0.034502137 |