Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Entropy/Mean,Entropy/Stdev,Entropy/Max,Entropy/Min,Advantages/Mean,Advantages/Stdev,Advantages/Max,Advantages/Min,Values/Mean,Values/Stdev,Values/Max,Values/Min,Value Loss/Mean,Value Loss/Stdev,Value Loss/Max,Value Loss/Min,Policy Loss/Mean,Policy Loss/Stdev,Policy Loss/Max,Policy Loss/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min,TD targets/Mean,TD targets/Stdev,TD targets/Max,TD targets/Min,actions/Mean,actions/Stdev,actions/Max,actions/Min
1,0.0,1.0,1000.0,1.0,1000.0,1000.0,0.0,,,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2,0.0,1.0,2000.0,2.0,1000.0,2000.0,0.0,,,1.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
3,999.0,0.0,3000.0,3.0,1000.0,3000.0,-0.017666830179174003,0.0,0.0,1.0,,,,0.005126546850151572,0.004660130005352106,0.05132860690355301,0.0005627279169857502,0.00010000000000000003,4.0657581468206416e-20,0.0001,0.0001,0.1678458,0.12852536,1.5926425,0.024081124,,,,,,,,,,,,,,,,,,,,,0.24388833,0.11236252,0.42840713,-0.8727883,-0.01800971130044855,0.1942566799457346,0.5385999780893326,-0.9172834092378616,-0.11966089038501045,0.8962365587209448,1.3716363433793126,-1.5680451743766328
4,1999.0,0.0,4000.0,4.0,1000.0,4000.0,-0.039999362478752916,0.0,0.0,1.0,,,,0.0008180646479820358,0.000529273102626917,0.0054473504424095145,0.00014673141413368285,0.00010000000000000003,4.0657581468206416e-20,0.0001,0.0001,0.0469651,0.025094092000000002,0.22221590000000002,0.010784525,,,,,,,,,,,,,,,,,,,,,0.14337498,0.17592207,0.33719423,-0.28446856,0.20208258858056294,0.13431578391837634,0.5768654608726501,-0.5833876812458039,-0.2705900161928217,0.9272508528236816,1.9572209345620784,-1.4727463554915825
5,2999.0,0.0,5000.0,5.0,1000.0,5000.0,0.17145601483403705,0.0,0.0,0.0,,,,0.0003958249435308753,0.00031769597300822634,0.0040870513767004004,0.00010442566417623311,0.00010000000000000003,4.0657581468206416e-20,0.0001,0.0001,0.025218817999999997,0.013975793500000002,0.14064097,0.0070197446999999994,,,,,,,,,,,,,,,,,,,,,-0.04435015,0.030164617999999997,0.124313995,-0.2207815,0.26081367032274555,0.12723809202247516,0.5931554335355759,-0.2620022776722908,-0.2959362262223715,0.6939703144112135,1.0669463809202309,-1.416814717430604