mirror of https://github.com/gryf/coach.git synced 2026-01-30 20:35:47 +01:00
coach/rl_coach/traces/Atari_A3C_pong/trace.csv
2018-08-20 13:01:30 +03:00

2.0 KiB

Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Entropy/Mean,Entropy/Stdev,Entropy/Max,Entropy/Min,Advantages/Mean,Advantages/Stdev,Advantages/Max,Advantages/Min,Values/Mean,Values/Stdev,Values/Max,Values/Min,Value Loss/Mean,Value Loss/Stdev,Value Loss/Max,Value Loss/Min,Policy Loss/Mean,Policy Loss/Stdev,Policy Loss/Max,Policy Loss/Min
1,0.0,1.0,881.0,1.0,881.0,881.0,0.0,,,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2,0.0,1.0,1043.0,1.0,1043.0,1924.0,0.0,,,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
3,40.0,0.0,800.0,1.0,800.0,2724.0,0.0,-21.0,-21.0,0.0,,,,1.9787095000000001,1.6848618999999998,7.008404,0.010777535,,,,,,,,,1.6982583999999998,0.09253728,1.7866282,1.4752818,-1.1255737884236818,1.011838146554653,0.5801041126251221,-3.300693988800049,-0.599011,0.9902591000000001,0.05652909,-3.2763072999999996,0.15230414,0.30587232,1.9219400000000002,2.232377e-05,-0.35678893,0.77703935,0.5012678,-3.3208616
4,84.0,0.0,874.0,1.0,874.0,3598.0,0.0,-21.0,-21.0,0.0,,,,2.7599952,4.4335379999999995,30.457005,0.31769055,,,,,,,,,1.6350366000000003,0.04428426,1.6724039000000002,1.4955306000000002,0.012454061887480996,0.5316342837334819,2.479757070541382,-0.97082257270813,-1.8976341,0.45726654,-1.5126665000000001,-3.3147082,0.14139508,0.42458066,2.8789976000000004,0.01374856,0.022890297999999996,0.7258614999999999,3.5482929,-1.1909457
5,133.0,0.0,962.0,1.0,962.0,4560.0,0.0,-21.0,-21.0,0.0,,,,2.2067685,1.6116456000000001,8.973311,0.21847898,,,,,,,,,1.671777,0.014726803,1.6910971000000001,1.6381558,-0.005908215376385918,0.3910391646463375,1.0921021699905396,-0.9658074378967284,-1.8732144,0.11304277,-1.7097299,-2.2033259999999997,0.07926889,0.09812635,0.5330912,0.017862901,0.0026857766999999998,0.55807036,1.739695,-1.2937489