Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Entropy/Mean,Entropy/Stdev,Entropy/Max,Entropy/Min,Advantages/Mean,Advantages/Stdev,Advantages/Max,Advantages/Min,Values/Mean,Values/Stdev,Values/Max,Values/Min,Value Loss/Mean,Value Loss/Stdev,Value Loss/Max,Value Loss/Min,Policy Loss/Mean,Policy Loss/Stdev,Policy Loss/Max,Policy Loss/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min,TD targets/Mean,TD targets/Stdev,TD targets/Max,TD targets/Min,actions/Mean,actions/Stdev,actions/Max,actions/Min
1,0.0,1.0,1000.0,1.0,1000.0,1000.0,0.0,,,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2,0.0,1.0,2000.0,2.0,1000.0,2000.0,0.0,,,1.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
3,999.0,0.0,3000.0,3.0,1000.0,3000.0,-0.017666830179174003,0.0,0.0,1.0,,,,0.0038191572866389523,0.0037619606969040414,0.026500405743718147,0.0005351771251298487,0.00010000000000000003,4.0657581468206416e-20,0.0001,0.0001,0.14963633,0.12557492,1.0296046,0.022291046000000002,,,,,,,,,,,,,,,,,,,,,-0.06052997,0.07319117,0.09228404,-0.81788486,-0.2878782119320924,0.18290294876848567,0.15647711277008056,-1.5949552059173584,0.001995146476856135,0.7060122989726414,1.2513357156871705,-1.2238670209506697
4,1999.0,0.0,4000.0,4.0,1000.0,4000.0,-0.039999362478752916,0.0,0.0,1.0,,,,0.0005999824570681085,0.0005200396841794251,0.007392321713268756,0.0001155680656665936,0.00010000000000000003,2.7105054312137605e-20,0.0001,0.0001,0.035074446,0.023916507000000004,0.27376664,0.007415438000000001,,,,,,,,,,,,,,,,,,,,,-0.055866152,0.03557198,0.048398294,-0.20761846,-0.1228111611005996,0.10064295523824936,0.10938632614910604,-0.9385637390613556,0.3813050626909247,0.8586455988935526,1.9421025144380244,-1.2856749345811207
5,2999.0,0.0,5000.0,5.0,1000.0,5000.0,0.17145601483403705,0.0,0.0,0.0,,,,0.00016860231386453962,8.468335419663504e-05,0.0008860444650053977,4.115650153835304e-05,0.00010000000000000003,2.7105054312137605e-20,0.0001,0.0001,0.013316613,0.0064833634999999995,0.055022966,0.0038480079,,,,,,,,,,,,,,,,,,,,,-0.04638309,0.028830105,-0.0011850878,-0.36706513,-0.055944047987441965,0.061761207984293,0.16325088679790498,-0.6333049428462982,0.2629062319462116,0.7653580979608734,1.6618100999356682,-1.1699750612176198