mirror of https://github.com/gryf/coach.git synced 2025-12-29 01:42:28 +01:00
coach/rl_coach/traces/Pendulum_HAC/trace.csv
itaicaspi-intel fa4895f840 new traces
2018-09-13 11:47:36 +03:00

2.1 KiB

Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Discounted Return/Mean,Discounted Return/Stdev,Discounted Return/Max,Discounted Return/Min,Entropy/Mean,Entropy/Stdev,Entropy/Max,Entropy/Min,Advantages/Mean,Advantages/Stdev,Advantages/Max,Advantages/Min,Values/Mean,Values/Stdev,Values/Max,Values/Min,Value Loss/Mean,Value Loss/Stdev,Value Loss/Max,Value Loss/Min,Policy Loss/Mean,Policy Loss/Stdev,Policy Loss/Max,Policy Loss/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min,TD targets/Mean,TD targets/Stdev,TD targets/Max,TD targets/Min,actions/Mean,actions/Stdev,actions/Max,actions/Min
1,0.0,1.0,97.0,1.0,25.0,25.0,0.0,,,0.0,,,,,,,,,,,,,,,,-480.6903328824848,254.92315277284558,-40.0,-888.7145624034126,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2,0.0,1.0,194.0,2.0,25.0,50.0,0.0,,,0.0,,,,,,,,,,,,,,,,-480.6903328824848,254.92315277284558,-40.0,-888.7145624034126,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
3,0.0,0.0,291.0,3.0,25.0,75.0,-0.013705192291281485,-1000.0,-1000.0,0.0,,,,,,,,,,,,,,,,-480.6903328824848,254.92315277284558,-40.0,-888.7145624034126,,,,,,,,,,,,,,,,,,,,,-0.21267389,0.09754460000000001,-0.103998035,-0.46418846,,,,,-0.2677615218140108,0.9221390139573292,1.6445876359939575,-4.0093758958623535
4,0.0,0.0,388.0,4.0,25.0,100.0,-0.02430443169727376,-1000.0,-1000.0,0.0,,,,,,,,,,,,,,,,-480.6903328824848,254.92315277284558,-40.0,-888.7145624034126,,,,,,,,,,,,,,,,,,,,,-0.5520972,0.33804613,-0.16024278,-1.3700855,,,,,-0.059834032790882175,2.0965905392737207,5.847123362995334,-9.20645523071289
5,0.0,0.0,485.0,5.0,25.0,125.0,0.0,-1000.0,-1000.0,0.0,,,,,,,,,,,,,,,,-480.6903328824848,254.92315277284558,-40.0,-888.7145624034126,,,,,,,,,,,,,,,,,,,,,-0.27497244,0.19105045,-0.08130584,-0.90415394,,,,,-0.3390362190323586,1.3539567588293069,1.7026247944415995,-7.820933549975827