mirror of https://github.com/gryf/coach.git synced 2026-01-29 03:25:47 +01:00
coach/rl_coach/traces/Atari_NStepQ_pong/trace.csv
itaicaspi-intel fa4895f840 new traces
2018-09-13 11:47:36 +03:00

1.9 KiB

Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Discounted Return/Mean,Discounted Return/Stdev,Discounted Return/Max,Discounted Return/Min,Entropy/Mean,Entropy/Stdev,Entropy/Max,Entropy/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min,Q Values/Mean,Q Values/Stdev,Q Values/Max,Q Values/Min,Value Loss/Mean,Value Loss/Stdev,Value Loss/Max,Value Loss/Min
1,0.0,1.0,1117.0,1.0,1117.0,1117.0,0.5,,,0.0,,,,,,,,,,,,,,,,-1.5180229894995567,0.6998808293377133,-0.08930329112720292,-3.148474706421977,,,,,,,,,,,,,,,,
2,163.0,0.0,821.0,1.0,821.0,1938.0,0.4919541999999965,-21.0,-21.0,0.0,,,,,,,,,,,,,,,,-2.405652578063971,0.6237147471281423,-0.7105532272722921,-3.3691179328950627,,,,,,,,,0.25339470000000003,0.06996354,0.40677336,-0.35897204,0.035283737,0.10252844,1.0475135,1.1831225500000001e-05
3,320.0,0.0,782.0,1.0,782.0,2720.0,0.4842905999999932,-21.0,-21.0,0.0,,,,,,,,,,,,,,,,-2.4614277069600043,0.5586658402302739,-0.7105532272722921,-3.354852824180864,,,,,,,,,0.20715186,0.062277785999999995,0.35004243,0.0036477323,0.05950941,0.13284620000000003,0.55984885,1.9053832e-05
4,522.0,0.0,1009.0,1.0,1009.0,3729.0,0.4744023999999889,-19.0,-19.0,0.0,,,,,,,,,,,,,,,,-1.74034851817599,0.8736518980911252,0.29537702481737355,-3.229858453919355,,,,,,,,,0.1964524,0.06919237,0.40447715,0.0016004617,0.08728501,0.21507107,0.96532106,3.7585607e-05
5,673.0,0.0,755.0,1.0,755.0,4484.0,0.4670033999999857,-21.0,-21.0,0.0,,,,,,,,,,,,,,,,-2.5246431129611286,0.5835765895797549,-0.7105532272722921,-3.3699982440767453,,,,,,,,,0.16121916,0.030521521,0.26771998,0.09214279,0.11407282,0.2374467,0.7852985,0.00861873