Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Discounted Return/Mean,Discounted Return/Stdev,Discounted Return/Max,Discounted Return/Min,Entropy/Mean,Entropy/Stdev,Entropy/Max,Entropy/Min,Advantages/Mean,Advantages/Stdev,Advantages/Max,Advantages/Min,Values/Mean,Values/Stdev,Values/Max,Values/Min,Value Loss/Mean,Value Loss/Stdev,Value Loss/Max,Value Loss/Min,Policy Loss/Mean,Policy Loss/Stdev,Policy Loss/Max,Policy Loss/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min,TD targets/Mean,TD targets/Stdev,TD targets/Max,TD targets/Min,actions/Mean,actions/Stdev,actions/Max,actions/Min
1,0.0,1.0,1000.0,1.0,1000.0,1000.0,0.0,,,0.0,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2,0.0,1.0,2000.0,2.0,1000.0,2000.0,0.0,,,1.0,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
3,999.0,0.0,3000.0,3.0,1000.0,3000.0,-0.017666830179174003,0.0,0.0,1.0,,,,0.0029464550291862087,0.0025701377750570642,0.02788718044757843,0.0006394493393599987,0.00010000000000000003,4.0657581468206416e-20,0.0001,0.0001,0.13357769,0.117093444,1.2991068000000001,0.026759505,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,0.15686034,0.0627305,0.39373034,-0.3585922,-0.0033900204140713077,0.15875771875068714,0.5218342781066895,-0.5829120719432831,0.018559266145446292,0.8379639652873171,1.3133153498255412,-1.2431993702510542
4,1999.0,0.0,4000.0,4.0,1000.0,4000.0,-0.039999362478752916,0.0021780076323496323,0.02178007632349632,1.0,,,,0.0006978411426705012,0.00034956689330895783,0.002974547212943435,0.0001349833473796025,0.00010000000000000003,4.0657581468206416e-20,0.0001,0.0001,0.041489832000000004,0.02103676,0.1902087,0.010260164,2.3785131604782346e-05,0.00021786301549431403,0.002166953447005496,0.0,,,,,,,,,,,,,,,,,,,,,0.10348537,0.037828527,0.58358335,-0.20245944,0.1538198277351862,0.12114966051922052,0.6158200460672378,-0.3744521605968476,0.08143989407802325,0.8094344175263435,1.2337204414679308,-1.3201327969582874
5,2999.0,0.0,5000.0,5.0,1000.0,5000.0,0.17145601483403705,0.0,0.0,0.0,,,,0.0004858621195543909,0.0005712790351431513,0.00709826499223709,9.616250463295728e-05,0.00010000000000000003,4.0657581468206416e-20,0.0001,0.0001,0.028758908,0.02100075,0.23695771,0.007761135,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,-0.082560025,0.042022076,0.32945228,-0.19682288,0.20584864377714734,0.1282641644604344,0.6959143048524856,-0.15129065528511998,-0.22636502595053595,0.7716659678121603,1.6171153782369072,-1.244061515013705