mirror of https://github.com/gryf/coach.git synced 2026-02-13 12:25:47 +01:00
coach/rl_coach/traces/Atari_DDQN_pong/trace.csv
2018-08-20 13:01:30 +03:00

Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min
1,0.0,1.0,1117.0,1117.0,1117.0,1117.0,1.0,,,0.0,,,,,,,,,,,,,,,,,,,
2,210.0,0.0,1958.0,1958.0,841.0,1958.0,0.999167410000018,-20.0,-20.0,0.0,,,,0.011756908099604993,0.01245646310720048,0.05387234315276146,0.00010689756891224532,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.057962038,0.04616896400000001,0.26208854,0.0071766186,,,,
3,402.0,0.0,2726.0,2726.0,768.0,2726.0,0.9984070900000346,-21.0,-21.0,0.0,,,,0.012809355009267165,0.013771132011321113,0.07975033670663834,5.99101695115678e-05,0.0002500000000000001,5.421010862427521e-20,0.00025,0.00025,0.052051324,0.028359309,0.17658195,0.008862591,-0.017426128,0.0060299635,-0.008042792,-0.026319288
4,601.0,0.0,3519.0,3519.0,793.0,3519.0,0.9976220200000516,-21.0,-21.0,0.0,,,,0.015272312543569037,0.013672084153799915,0.05628284066915512,0.00023415754549205303,0.0002500000000000001,5.421010862427521e-20,0.00025,0.00025,0.052314125,0.023336997,0.1473458,0.012913031,-0.031559315,0.0042713494,-0.023393027,-0.036418874
5,809.0,0.0,4352.0,4352.0,833.0,4352.0,0.9967973500000696,-21.0,-21.0,0.0,,,,0.013082799735107424,0.01255374334846098,0.06567259877920151,0.0004701522993855178,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.043265857000000005,0.014534917,0.089655906,0.017195849,-0.0053307074,0.0027605025,-0.0019208845999999999,-0.00974094