mirror of https://github.com/gryf/coach.git synced 2025-12-28 09:22:31 +01:00
coach/rl_coach/traces/Pendulum_HAC/trace.csv
2018-10-02 17:55:16 +03:00

2.1 KiB

Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Discounted Return/Mean,Discounted Return/Stdev,Discounted Return/Max,Discounted Return/Min,Entropy/Mean,Entropy/Stdev,Entropy/Max,Entropy/Min,Advantages/Mean,Advantages/Stdev,Advantages/Max,Advantages/Min,Values/Mean,Values/Stdev,Values/Max,Values/Min,Value Loss/Mean,Value Loss/Stdev,Value Loss/Max,Value Loss/Min,Policy Loss/Mean,Policy Loss/Stdev,Policy Loss/Max,Policy Loss/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min,TD targets/Mean,TD targets/Stdev,TD targets/Max,TD targets/Min,actions/Mean,actions/Stdev,actions/Max,actions/Min
1,0.0,1.0,97.0,1.0,25.0,25.0,0.0,0.0,-480.6903328824848,254.92315277284558,-40.0,-888.7145624034126
2,0.0,1.0,194.0,2.0,25.0,50.0,0.0,0.0,-480.6903328824848,254.92315277284558,-40.0,-888.7145624034126
3,0.0,0.0,291.0,3.0,25.0,75.0,-0.013705192291281485,-1000.0,-1000.0,0.0,-480.6903328824848,254.92315277284558,-40.0,-888.7145624034126,-0.04823625,0.023806227000000003,0.021652615,-0.10660829,-0.12859384303253607,0.18325901915279744,0.04875503852963448,-0.7279058694839478
4,0.0,0.0,388.0,4.0,25.0,100.0,-0.02430443169727376,-1000.0,-1000.0,0.0,-480.6903328824848,254.92315277284558,-40.0,-888.7145624034126,-0.04093397,0.045165304,0.09869731,-0.120250694,-0.16543949867939364,0.2667459507740165,0.1111396551132202,-1.4467874369949862
5,0.0,0.0,485.0,5.0,25.0,125.0,0.0,-1000.0,-1000.0,0.0,-480.6903328824848,254.92315277284558,-40.0,-888.7145624034126,-0.050059527,0.0325627,0.057268333,-0.10848339,-0.10842234288811206,0.1848078590593072,0.10341470901523106,-0.6624982986866047