mirror of https://github.com/gryf/coach.git synced 2026-02-14 04:45:50 +01:00
coach/rl_coach/traces/Atari_A3C_LSTM_pong/trace.csv
2018-08-20 13:01:30 +03:00

2.0 KiB

1Episode #Training IterIn HeatupER #TransitionsER #EpisodesEpisode LengthTotal stepsEpsilonShaped Training RewardTraining RewardUpdate Target NetworkEvaluation RewardShaped Evaluation RewardSuccess RateLoss/MeanLoss/StdevLoss/MaxLoss/MinLearning Rate/MeanLearning Rate/StdevLearning Rate/MaxLearning Rate/MinGrads (unclipped)/MeanGrads (unclipped)/StdevGrads (unclipped)/MaxGrads (unclipped)/MinEntropy/MeanEntropy/StdevEntropy/MaxEntropy/MinAdvantages/MeanAdvantages/StdevAdvantages/MaxAdvantages/MinValues/MeanValues/StdevValues/MaxValues/MinValue Loss/MeanValue Loss/StdevValue Loss/MaxValue Loss/MinPolicy Loss/MeanPolicy Loss/StdevPolicy Loss/MaxPolicy Loss/Min
210.01.0772.01.0772.0772.00.00.0
320.01.0821.01.0821.01593.00.00.0
4340.00.0798.01.0798.02391.00.0-21.0-21.00.01.69254331.561665.47826050.0111572831.71780730.095908181.79162791.4786046-1.17884408221402671.2552065423802111.0642428398132324-3.2777271270751958-1.38114631.9222080000000001-0.023936661-5.87302060.341450270.4391212.78809930.00036954717000000004-0.131854890.92372730000000010.9732863000000002-4.1131425
5478.00.0755.01.0755.03146.00.0-21.0-21.00.02.47065110000000044.672685630.5836010.670717361.51119070.0241246951.54921089999999981.4397970.247637186641743450.81476908232049964.918219566345215-0.9567646980285645-4.6560935999999990.40140057-2.9237971000000003-5.87533140.4433551.704298110.8026889999999990.085763870.411623400000000031.30141399999999987.6323996-0.8151264
65116.00.0756.01.0756.03902.00.0-21.0-21.00.01.8506462.420251399999999716.2952880.42900051.32249640.0509453859999999951.43032731.20315580.141555700396859880.58144099429213253.3915529251098637-0.8861622810363771-3.01707270000000040.6280401999999999-2.1003609-4.33242370.223478620.8394185.3261850.053430060.227465660.822144874.685924-0.7187973000000001