mirror of https://github.com/gryf/coach.git synced 2026-01-10 15:54:12 +01:00
coach/rl_coach/traces/Atari_NStepQ_pong/trace.csv
2018-10-02 17:55:16 +03:00

1.9 KiB

Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Discounted Return/Mean,Discounted Return/Stdev,Discounted Return/Max,Discounted Return/Min,Entropy/Mean,Entropy/Stdev,Entropy/Max,Entropy/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min,Q Values/Mean,Q Values/Stdev,Q Values/Max,Q Values/Min,Value Loss/Mean,Value Loss/Stdev,Value Loss/Max,Value Loss/Min
1,0.0,1.0,1117.0,1.0,1117.0,1117.0,0.5,,,0.0,,,,,,,,,,,,,,,,-1.5180229894995567,0.6998808293377133,-0.08930329112720292,-3.148474706421977,,,,,,,,,,,,,,,,
2,151.0,0.0,760.0,1.0,760.0,1877.0,0.4925519999999968,-21.0,-21.0,0.0,,,,,,,,,,,,,,,,-2.5205372468300253,0.5838419974113738,-0.7105532272722921,-3.3699982440767453,,,,,,,,,0.07112427,0.07581978,0.32078072,-0.06063268,0.11560041,0.23668602,0.97285825,0.00011849090000000001
3,353.0,0.0,1008.0,1.0,1008.0,2885.0,0.4826735999999925,-20.0,-20.0,0.0,,,,,,,,,,,,,,,,-1.8720954986783211,0.7097144372888278,-0.3754689651451796,-3.3225778431943085,,,,,,,,,0.12676543,0.06586319,0.30209106,-0.04302346,0.06763674,0.17496337,0.88473463,9.743495999999999e-05
4,516.0,0.0,814.0,1.0,814.0,3699.0,0.474696399999989,-20.0,-20.0,0.0,,,,,,,,,,,,,,,,-2.193174392327956,0.69198455704147,-0.4780715307780122,-3.3372597252516765,,,,,,,,,0.10700438,0.038330942,0.31717354,0.03721839,0.10371645,0.23203200000000002,1.0052496,0.00057114265
5,703.0,0.0,932.0,1.0,932.0,4631.0,0.465562799999985,-21.0,-21.0,0.0,,,,,,,,,,,,,,,,-2.122317833023211,0.6454978854674346,-0.7105532272722921,-3.354852824180864,,,,,,,,,0.16040073,0.044896505999999996,0.29440245,0.063172355,0.04608961,0.11784693,0.5444492,0.00014167171000000002