mirror of https://github.com/gryf/coach.git synced 2025-12-28 17:32:27 +01:00
coach/rl_coach/traces/Pendulum_HAC/trace.csv
2018-08-20 13:01:30 +03:00

1.7 KiB

Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Entropy/Mean,Entropy/Stdev,Entropy/Max,Entropy/Min,Advantages/Mean,Advantages/Stdev,Advantages/Max,Advantages/Min,Values/Mean,Values/Stdev,Values/Max,Values/Min,Value Loss/Mean,Value Loss/Stdev,Value Loss/Max,Value Loss/Min,Policy Loss/Mean,Policy Loss/Stdev,Policy Loss/Max,Policy Loss/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min,TD targets/Mean,TD targets/Stdev,TD targets/Max,TD targets/Min,actions/Mean,actions/Stdev,actions/Max,actions/Min
1,0.0,1.0,97.0,1.0,25.0,25.0,0.0,0.0
2,0.0,1.0,194.0,2.0,25.0,50.0,0.0,0.0
3,0.0,0.0,291.0,3.0,25.0,75.0,-0.03819695695002292,-1000.0,-1000.0,0.0,-0.05867912,0.040427182,0.038633604,-0.13119522,-0.5875804651149715,0.9883034640881114,0.2924503923099136,-3.1955509185791016
4,0.0,0.0,388.0,4.0,25.0,100.0,0.008508156342542239,-1000.0,-1000.0,0.0,-0.04915462,0.027965656000000002,0.015574882,-0.11603892,-0.5310139374222866,0.9150246753002113,0.2726461971315715,-2.9480751131842533
5,0.0,0.0,485.0,5.0,25.0,125.0,0.0,-1000.0,-1000.0,0.0,-0.047291752,0.027684617999999998,0.030320742999999997,-0.11130883,-0.5612901779256286,0.929480152044698,0.23112091422080994,-2.8455907461559957