Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Discounted Return/Mean,Discounted Return/Stdev,Discounted Return/Max,Discounted Return/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min
1,0.0,1.0,1117.0,1117.0,1117.0,1117.0,1.0,,,0.0,,,,,,,,,,,,,,,,-1.5180229894995567,0.6998808293377133,-0.08930329112720292,-3.148474706421977,,,,
2,197.0,0.0,1905.0,1905.0,788.0,1905.0,0.9992908000000232,-21.0,-21.0,0.0,,,,0.0061254552800413965,0.0043050134300493936,0.034036897122859955,6.793363718315959e-05,0.0002500000000000001,5.421010862427521e-20,0.00025,0.00025,0.024838297000000002,0.026057651,0.35551727,0.007053303,-2.4312329564518818,0.5717677860635341,-0.7105532272722921,-3.3662833646890835,,,,
3,436.0,0.0,2862.0,2862.0,957.0,2862.0,0.9984295000000516,-20.0,-20.0,0.0,,,,0.005806613470321504,0.0030052668817285165,0.014995453879237175,0.00033133718534372747,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.016070435,0.005444049,0.042856682,0.0075193606,-1.9651299496437336,0.7810357358559473,-0.3655772928295825,-3.2941461345643885,0.049539614,0.019873414,0.07862351,0.014469368
4,627.0,0.0,3623.0,3623.0,761.0,3623.0,0.9977446000000744,-21.0,-21.0,0.0,,,,0.006210500930166361,0.0030342874917922576,0.01707890443503857,0.0003939484595321119,0.0002500000000000001,5.421010862427521e-20,0.00025,0.00025,0.014544703,0.0057146824,0.039981294,0.004752383,-2.5196481268264272,0.5839729089128289,-0.7105532272722921,-3.3699982440767453,,,,
5,855.0,0.0,4535.0,4535.0,912.0,4535.0,0.9969238000001012,-20.0,-20.0,0.0,,,,0.005768066168489065,0.0026172456712171143,0.012438178062438965,0.0004492225707508624,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.012245178999999998,0.0040054265,0.02887605,0.005628841,-1.9480557195901917,0.7908203737498453,-0.21371412100620468,-3.2726053764291825,0.04762842,0.011995862,0.06981121,0.027042279