Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Discounted Return/Mean,Discounted Return/Stdev,Discounted Return/Max,Discounted Return/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min
1,0.0,1.0,1117.0,1117.0,1117.0,1117.0,1.0,,,0.0,,,,,,,,,,,,,,,,-1.5180229894995567,0.6998808293377133,-0.08930329112720292,-3.148474706421977,,,,
2,205.0,0.0,1937.0,1937.0,820.0,1937.0,0.9991882000000176,-21.0,-21.0,0.0,,,,0.013030857562661391,0.014064394699560966,0.06334743648767471,2.0799992853426374e-05,0.00010000000000000002,1.3552527156068802e-20,0.0001,0.0001,0.14124954,0.15512846,0.9356431000000001,0.0037519853000000003,-2.3361342922088504,0.784322378590693,-0.38878391807422696,-3.369599601005491,,,,
3,413.0,0.0,2768.0,2768.0,831.0,2768.0,0.9983655100000356,-21.0,-21.0,0.0,,,,0.012065147789319152,0.014264555162560488,0.08900705724954605,3.336678491905332e-05,0.00010000000000000003,2.7105054312137605e-20,0.0001,0.0001,0.064539365,0.04878198,0.3403983,0.007120714,-2.320394201181889,0.6047235028955231,-0.7105532272722921,-3.350537576335216,0.018258663,0.008410561,0.03039161,0.0031909812
4,667.0,0.0,3783.0,3783.0,1015.0,3783.0,0.9973606600000572,-20.0,-20.0,0.0,,,,0.013943941339460959,0.012418257338636593,0.05291028320789337,1.9593317119870335e-05,0.00010000000000000002,1.3552527156068802e-20,0.0001,0.0001,0.11973736,0.119031124,0.8044271,0.0012378334,-1.7531357837449677,0.7448577440634202,-0.1288331810939122,-3.2971074888190803,-0.02297779,0.0040406515,-0.01676904,-0.027261153
5,867.0,0.0,4585.0,4585.0,802.0,4585.0,0.9965666800000744,-21.0,-21.0,0.0,,,,0.012252497132776626,0.013290761767708568,0.06478248536586761,3.438068233663216e-05,0.00010000000000000002,1.3552527156068802e-20,0.0001,0.0001,0.06426661,0.056941237,0.30591047,0.005093545,-2.406465837413259,0.5636980823469648,-0.7105532272722921,-3.36383697254212,-0.0066491687,0.007894201,0.006075496,-0.020406375