Mirror of https://github.com/gryf/coach.git, synced 2026-01-29 03:25:47 +01:00
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()
* tests readme file and refactoring of policy optimization agent train function
* Update README.md
* Update README.md
* additional policy optimization train function simplifications
* Updated the traces after the reordering of the environment reset
* docker and jenkins files
* updated the traces to the ones from within the docker container
* updated traces and added control suite to the docker
* updated jenkins file with the intel proxy + updated doom basic a3c test params
* updated line breaks in jenkins file
* added a missing line break in jenkins file
* refining trace tests ignored presets + adding a configurable beta entropy value
* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue
* updated benchmarks for dueling ddqn breakout and pong
* allowing dynamic updates to the loss weights + bug fix in episode.update_returns
* remove docker and jenkins file
11 KiB
| Line | Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Entropy/Mean | Entropy/Stdev | Entropy/Max | Entropy/Min | Advantages/Mean | Advantages/Stdev | Advantages/Max | Advantages/Min | Values/Mean | Values/Stdev | Values/Max | Values/Min | Value Loss/Mean | Value Loss/Stdev | Value Loss/Max | Value Loss/Min | Policy Loss/Mean | Policy Loss/Stdev | Policy Loss/Max | Policy Loss/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 1 | 0.0 | 1.0 | 172.0 | 1.0 | 172.0 | 172.0 | 0.0 | 0.0 | |||||||||||||||||||||||||||||||||||||
| 3 | 2 | 0.0 | 1.0 | 79.0 | 1.0 | 79.0 | 251.0 | 0.0 | 0.0 | |||||||||||||||||||||||||||||||||||||
| 4 | 3 | 0.0 | 1.0 | 96.0 | 1.0 | 96.0 | 347.0 | 0.0 | 0.0 | |||||||||||||||||||||||||||||||||||||
| 5 | 4 | 0.0 | 1.0 | 371.0 | 1.0 | 371.0 | 718.0 | 0.0 | 0.0 | |||||||||||||||||||||||||||||||||||||
| 6 | 5 | 0.0 | 1.0 | 344.0 | 1.0 | 344.0 | 1062.0 | 0.0 | 0.0 | |||||||||||||||||||||||||||||||||||||
| 7 | 6 | 12.0 | 0.0 | 254.0 | 1.0 | 254.0 | 1316.0 | 0.0 | 6.0 | 80.0 | 0.0 | 0.19567649 | 0.2205969 | 0.6815422 | 0.00015108055 | 1.7916273999999999 | 4.0483294e-05 | 1.7916965 | 1.7915238999999998 | 0.2248260026458108 | 0.4042464626215307 | 1.0014296770095823 | -0.01052277535200119 | 0.017354053 | 0.01462093 | 0.050496485 | -0.003664921 | 0.10698097 | 0.12332857400000001 | 0.38313812 | 4.9710330000000006e-08 | 0.40305725 | 0.47160277 | 1.4861937 | -0.009990961999999999 | |||||||||||
| 8 | 7 | 28.0 | 0.0 | 310.0 | 1.0 | 310.0 | 1626.0 | 0.0 | 1.0 | 30.0 | 0.0 | 0.056821294 | 0.1850466 | 0.74911565 | 0.002254054 | 1.7915976000000002 | 4.443067e-05 | 1.7916876 | 1.7914053 | 0.04728503023584684 | 0.219276227376429 | 0.9940330386161804 | -0.02618091553449631 | 0.06592112 | 0.005795096 | 0.07936180400000001 | 0.05113762 | 0.02515897 | 0.09394965 | 0.37668633 | 4.6014825e-06 | 0.08473615 | 0.37085986 | 1.4722064 | -0.02690432 | |||||||||||
| 9 | 8 | 35.0 | 0.0 | 130.0 | 1.0 | 130.0 | 1756.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.008976445500000001 | 0.0029242659999999996 | 0.014560096000000002 | 0.0052981568 | 1.7915395 | 6.856453e-05 | 1.7916580000000002 | 1.791449 | -0.008989325724542141 | 0.006062908159684999 | 0.0016912072896957395 | -0.029597371816635125 | 0.090277284 | 0.0039322246 | 0.09905626 | 0.083528996 | 5.8783422e-05 | 4.0506467e-05 | 0.00014373315 | 1.9011064e-05 | -0.016116107 | 0.0053216093 | -0.00930417 | -0.026238699 | |||||||||||
| 10 | 9 | 46.0 | 0.0 | 216.0 | 1.0 | 216.0 | 1972.0 | 0.0 | 3.0 | 20.0 | 0.0 | 0.10452251 | 0.23807605 | 0.81302756 | 0.0076934476 | 1.7915101999999998 | 6.7988265e-05 | 1.7916223999999998 | 1.7912793999999999 | 0.07167001135647298 | 0.3004298947139229 | 1.860807538032532 | -0.03583203256130218 | 0.12916516 | 0.0068457909999999995 | 0.14647742 | 0.11421147 | 0.047697347 | 0.12782575 | 0.42886597 | 4.2765655999999996e-05 | 0.12837012 | 0.40204559999999995 | 1.3241034999999999 | -0.035978295 | |||||||||||
| 11 | 10 | 54.0 | 0.0 | 151.0 | 1.0 | 151.0 | 2123.0 | 0.0 | 3.0 | 45.0 | 0.0 | 0.20279299 | 0.23925558 | 0.6407468000000001 | 0.015352413 | 1.7914312 | 7.839572e-05 | 1.7915441000000003 | 1.791204 | 0.14761355881180085 | 0.3582254799901992 | 0.9790486097335817 | -0.04357487708330154 | 0.15072754 | 0.0058448114 | 0.16560993 | 0.1391081 | 0.075057626 | 0.095039055 | 0.23537856 | 8.976024400000001e-05 | 0.26476184 | 0.37308687 | 0.9004108000000001 | -0.032910552 | |||||||||||
| 12 | 11 | 59.0 | 0.0 | 86.0 | 1.0 | 86.0 | 2209.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.022077147000000002 | 0.0023909544 | 0.024708282 | 0.018208042 | 1.7914548 | 4.2731140000000004e-05 | 1.7915742 | 1.7913142 | -0.017075253836810588 | 0.010092539052040822 | -0.0002947151660919189 | -0.05733919143676758 | 0.17155758 | 0.00684912 | 0.19188917 | 0.16236548 | 0.00019671183 | 5.3809068e-05 | 0.00027407217 | 0.00012477461000000002 | -0.030653037 | 0.003763158 | -0.02515122 | -0.03556459 | |||||||||||
| 13 | 12 | 72.0 | 0.0 | 243.0 | 1.0 | 243.0 | 2452.0 | 0.0 | 4.0 | 50.0 | 0.0 | 0.30084065 | 0.40614513 | 1.2229686000000002 | 0.02054095 | 1.7910703000000001 | 0.0001751061 | 1.7913968999999996 | 1.790586 | 0.1536969940488537 | 0.3659362280292968 | 0.9784829616546632 | -0.07325080037117004 | 0.2755594 | 0.04179078 | 0.36495075 | 0.22700338 | 0.07876604 | 0.120651804 | 0.35853273 | 0.00015978249 | 0.27541003 | 0.49573959999999995 | 1.4355379 | -0.065343626 | |||||||||||
| 14 | 13 | 78.0 | 0.0 | 117.0 | 1.0 | 117.0 | 2569.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.085860215 | 0.0061960240000000005 | 0.09245826 | 0.07558973 | 1.7905262 | 0.00015552709 | 1.7908181 | 1.7902833999999999 | -0.04054099202156067 | 0.0224406094197929 | -0.002068936824798584 | -0.08899098634719849 | 0.3962928 | 0.008973127 | 0.42356229999999995 | 0.38356206 | 0.0010735764999999999 | 0.00016344577 | 0.0012648260000000001 | 0.0008468464 | -0.07239084 | 0.0038452593 | -0.06634041 | -0.07710241 | |||||||||||
| 15 | 14 | 83.0 | 0.0 | 92.0 | 1.0 | 92.0 | 2661.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.08987115 | 0.05077347 | 0.17710942 | 0.05345255 | 1.7906991 | 0.00014593783 | 1.7908918999999999 | 1.7901826 | -0.04640358872711658 | 0.0365554518574147 | -0.001846909523010254 | -0.14066559076309204 | 0.32658878 | 0.037642512 | 0.41903177 | 0.2930116 | 0.0017447972 | 0.0018669254 | 0.0049749226 | 0.0005580194400000001 | -0.08308937400000001 | 0.04248289 | -0.052891272999999996 | -0.15625969 | |||||||||||
| 16 | 15 | 96.0 | 0.0 | 250.0 | 1.0 | 250.0 | 2911.0 | 0.0 | 4.0 | 30.0 | 0.0 | 0.24804004 | 0.32625040000000005 | 0.8471751000000001 | 0.027932685 | 1.7906308000000002 | 0.00016160042 | 1.7909075 | 1.7899588 | 0.09268262100716433 | 0.3162697034944009 | 0.9642215371131896 | -0.0929085910320282 | 0.35738194 | 0.015382448 | 0.39524317 | 0.32462870000000005 | 0.05430830000000001 | 0.08928497 | 0.22784210000000002 | 0.0003870035 | 0.16750437 | 0.37638140000000003 | 0.8883538000000001 | -0.078577496 | |||||||||||
| 17 | 16 | 105.0 | 0.0 | 173.0 | 1.0 | 173.0 | 3084.0 | 0.0 | 3.0 | 60.0 | 0.0 | 0.36758524 | 0.4459755 | 1.4121901000000001 | 0.061634187 | 1.7908936000000002 | 0.000116576695 | 1.7911555000000001 | 1.7905552 | 0.13567016897723078 | 0.3572671734435903 | 0.9771864414215088 | -0.16354811191558838 | 0.35916020000000004 | 0.036212217000000005 | 0.43366322 | 0.29071537 | 0.07302311 | 0.10714984 | 0.315386 | 0.0006234375 | 0.2417444 | 0.46547025 | 1.3016641000000002 | -0.08348486 | |||||||||||
| 18 | 17 | 110.0 | 0.0 | 100.0 | 1.0 | 100.0 | 3184.0 | 0.0 | 1.0 | 30.0 | 0.0 | 0.5112869999999999 | 0.7842299 | 1.8688575 | 0.021758934 | 1.7906313000000003 | 0.0001870035 | 1.7910262 | 1.7903508000000001 | 0.1910425681620836 | 0.3908652208594236 | 0.993660807609558 | -0.08672672510147095 | 0.36392245 | 0.02695719 | 0.41229507 | 0.32356548 | 0.09463646 | 0.16288367 | 0.37675855 | 9.845499e-05 | 0.34286004 | 0.67674136 | 1.5145043 | -0.07307689 | |||||||||||
| 19 | 18 | 121.0 | 0.0 | 215.0 | 1.0 | 215.0 | 3399.0 | 0.0 | 1.0 | 5.0 | 0.0 | 0.28810929999999996 | 0.39181823 | 1.459698 | 0.08809837 | 1.788362 | 0.00027721387000000005 | 1.7888026000000001 | 1.7874539 | 0.0051032425463199615 | 0.23262806622008686 | 0.9417458772659302 | -0.13614678382873535 | 0.62248904 | 0.018839955 | 0.6876956 | 0.5977665 | 0.02707093 | 0.07409433 | 0.2493394 | 0.00083600805 | 0.012020485 | 0.35114133 | 1.0632533000000002 | -0.14330262 | |||||||||||
| 20 | 19 | 126.0 | 0.0 | 92.0 | 1.0 | 92.0 | 3491.0 | 0.0 | 2.0 | 25.0 | 0.0 | 0.49123952 | 0.42201295 | 1.1937186000000002 | 0.13654065 | 1.7897191000000003 | 0.00025277978 | 1.7900676 | 1.7893028000000002 | 0.12664920873939992 | 0.3682718359779862 | 0.9198432564735411 | -0.12279212474822998 | 0.58103245 | 0.027373647 | 0.6279676 | 0.5478466 | 0.07583208 | 0.0827045 | 0.2031532 | 0.0018784435999999998 | 0.2255587 | 0.37950984 | 0.8095318 | -0.12593104 | |||||||||||
| 21 | 20 | 132.0 | 0.0 | 104.0 | 1.0 | 104.0 | 3595.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.11557531 | 0.01757393 | 0.13543893 | 0.08959412 | 1.7901121 | 9.133325500000001e-05 | 1.790232 | 1.7898425999999998 | -0.0482511255145073 | 0.029268962943657925 | 0.009157657623291016 | -0.12470829486846925 | 0.4982148 | 0.032707598 | 0.5453865999999999 | 0.4553813 | 0.0015924217000000001 | 0.00035940146000000003 | 0.002020723 | 0.0011408231 | -0.086312905 | 0.012666578999999999 | -0.06612517 | -0.099929444 | |||||||||||
| 22 | 21 | 157.0 | 0.0 | 486.0 | 1.0 | 486.0 | 4081.0 | 0.0 | 4.0 | 50.0 | 0.0 | 0.2326429 | 0.42413315 | 1.4717642 | 0.011350347 | 1.7909327000000002 | 0.0004338614 | 1.7914413 | 1.7897542000000002 | 0.06549707564214867 | 0.3132819970573645 | 1.830270767211914 | -0.14497065544128418 | 0.34287536 | 0.07915851 | 0.48324449999999997 | 0.21450326 | 0.051217735 | 0.1368152 | 0.53446305 | 4.279909000000001e-05 | 0.117416285 | 0.5005594 | 1.6238992 | -0.19046536 | |||||||||||
| 23 | 22 | 162.0 | 0.0 | 97.0 | 1.0 | 97.0 | 4178.0 | 0.0 | 2.0 | 55.0 | 0.0 | 0.5230014000000001 | 0.45992133 | 1.039975 | 0.063883886 | 1.7907598000000002 | 0.00013174287 | 1.7909822 | 1.7903869 | 0.2461168095469475 | 0.4223010358740354 | 0.9653756022453308 | -0.06252440810203552 | 0.32046944 | 0.008105095500000001 | 0.34606874 | 0.30065367 | 0.11945581 | 0.11948234 | 0.25733579999999995 | 0.0006940061 | 0.44450063 | 0.5093702999999999 | 1.0379127000000001 | -0.06295861 | |||||||||||
| 24 | 23 | 167.0 | 0.0 | 95.0 | 1.0 | 95.0 | 4273.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.07797143 | 0.006253591 | 0.08500799 | 0.0688162 | 1.7905513 | 0.00011041841999999999 | 1.7908473999999999 | 1.7903417000000001 | -0.03700839839875698 | 0.01797619761499475 | -0.0031774044036865234 | -0.07612943649291992 | 0.37993726 | 0.021876942000000003 | 0.41198814 | 0.3481909 | 0.0008463827 | 0.00015115745 | 0.001046534 | 0.00063928636 | -0.06634172 | 0.00587621 | -0.058958582999999995 | -0.07316028 | |||||||||||
| 25 | 24 | 178.0 | 0.0 | 217.0 | 1.0 | 217.0 | 4490.0 | 0.0 | 3.0 | 40.0 | 0.0 | 0.30123249999999996 | 0.41637412 | 1.1756048000000001 | 0.03484112 | 1.7904611000000001 | 0.00042778594000000006 | 1.7911371999999999 | 1.7896264 | 0.07372549414634705 | 0.3136862116853723 | 1.7823173999786377 | -0.1246633529663086 | 0.47646698 | 0.022771806000000002 | 0.53692746 | 0.4293259 | 0.051917247 | 0.10253852 | 0.28872594 | 0.0003387631 | 0.13538794 | 0.4160974 | 1.017875 | -0.09954969 | |||||||||||
| 26 | 25 | 183.0 | 0.0 | 82.0 | 1.0 | 82.0 | 4572.0 | 0.0 | 1.0 | 5.0 | 0.0 | 0.28755513 | 0.31245372 | 0.82867634 | 0.103177786 | 1.7910616000000001 | 8.027599e-05 | 1.7911837 | 1.7908571 | 0.060586804524064064 | 0.29926683817598804 | 0.9531143307685852 | -0.08926260471343994 | 0.46090984 | 0.005631511999999999 | 0.48508304 | 0.4509009 | 0.046615697000000005 | 0.07826726 | 0.18217851 | 0.001346837 | 0.10982244 | 0.33842890000000003 | 0.6959845 | -0.08936332 | |||||||||||
| 27 | 26 | 193.0 | 0.0 | 186.0 | 1.0 | 186.0 | 4758.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.099181354 | 0.060111597 | 0.25132 | 0.042370647000000004 | 1.7913789 | 5.84025e-05 | 1.7915074 | 1.7912476000000002 | -0.046101675927639016 | 0.03497843538935545 | 0.01036137342453003 | -0.14493098855018616 | 0.3926531 | 0.058744658 | 0.47917265 | 0.31410804 | 0.0016744278 | 0.0019926373 | 0.0070708264 | 0.00042033980000000004 | -0.08237186 | 0.04802715 | -0.039025433 | -0.20672412 | |||||||||||
| 28 | 27 | 215.0 | 0.0 | 440.0 | 1.0 | 440.0 | 5198.0 | 0.0 | 6.0 | 105.0 | 0.0 | 0.37764040000000004 | 0.6469134 | 2.5495335999999997 | 0.020546338 | 1.7905251 | 0.00035596412 | 1.7912682 | 1.7898917 | 0.12925171518609638 | 0.4019052044623574 | 1.8412196636199951 | -0.19034543633461 | 0.40561888 | 0.045567162 | 0.5183209 | 0.33286357 | 0.0891169 | 0.21326724 | 0.9306774999999999 | 0.00015935848999999998 | 0.2303355 | 0.63939536 | 2.3170965 | -0.110453404 | |||||||||||
| 29 | 28 | 227.0 | 0.0 | 238.0 | 1.0 | 238.0 | 5436.0 | 0.0 | 4.0 | 50.0 | 0.0 | 0.53855324 | 0.85370946 | 3.0082095 | 0.06336482 | 1.7894063000000002 | 0.0003872629 | 1.7902164 | 1.7885029 | 0.14335698945955794 | 0.4462530209518707 | 1.839568018913269 | -0.13642624020576474 | 0.51286656 | 0.03900573 | 0.5935630999999999 | 0.43189234 | 0.10984649999999999 | 0.2433033 | 0.848766 | 0.0005528301000000001 | 0.25607142 | 0.6681634 | 2.149166 | -0.12197556 | |||||||||||
| 30 | 29 | 232.0 | 0.0 | 86.0 | 1.0 | 86.0 | 5522.0 | 0.0 | 2.0 | 55.0 | 0.0 | 0.6893305999999999 | 0.8764063 | 2.2055849999999997 | 0.123763815 | 1.7884814999999998 | 0.00022844973 | 1.7890648 | 1.7881353 | 0.17138415090739728 | 0.4112120401154026 | 1.0042281150817869 | -0.16397744417190552 | 0.600002 | 0.017916113 | 0.65256387 | 0.5725639 | 0.099233955 | 0.1462143 | 0.351234 | 0.0024167297 | 0.305196 | 0.64387214 | 1.4128844999999999 | -0.13909039 | |||||||||||
| 31 | 30 | 242.0 | 0.0 | 184.0 | 1.0 | 184.0 | 5706.0 | 0.0 | 2.0 | 30.0 | 0.0 | 0.3528114 | 0.30696386 | 1.1932287 | 0.14946677 | 1.7857898 | 0.00069586793 | 1.7870337 | 1.7839484 | -0.026260635256767268 | 0.21069580671792834 | 0.8891516923904419 | -0.2381856441497803 | 0.8954148999999999 | 0.055663843 | 1.0139464 | 0.8006623 | 0.022541171000000002 | 0.046224325999999996 | 0.15216036 | 0.0017163947 | -0.047232665 | 0.22607273 | 0.582497 | -0.19729005 |
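
A trace like the one above can be sanity-checked with a few lines of Python. The sketch below is only an illustration and is not part of the repository: it embeds a hypothetical four-column excerpt of the first six data rows (`SAMPLE`), and the helper names `load_rows`, `split_heatup`, and `total_steps_consistent` are made up for this example. It separates heatup episodes from training episodes using the `In Heatup` column and checks that `Total steps` equals the running sum of `Episode Length`, which holds for the rows shown.

```python
import csv
import io

# Hypothetical excerpt: the first six data rows of the trace, reduced to four columns.
SAMPLE = """Episode #,In Heatup,Episode Length,Total steps
1,1.0,172.0,172.0
2,1.0,79.0,251.0
3,1.0,96.0,347.0
4,1.0,371.0,718.0
5,1.0,344.0,1062.0
6,0.0,254.0,1316.0
"""


def load_rows(text):
    """Parse CSV text into a list of dicts whose values are floats."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: float(value) for key, value in row.items()} for row in reader]


def split_heatup(rows):
    """Return (heatup_rows, training_rows) based on the 'In Heatup' flag."""
    heatup = [r for r in rows if r["In Heatup"] == 1.0]
    training = [r for r in rows if r["In Heatup"] != 1.0]
    return heatup, training


def total_steps_consistent(rows):
    """Check that 'Total steps' is the cumulative sum of 'Episode Length'."""
    running = 0.0
    for r in rows:
        running += r["Episode Length"]
        if r["Total steps"] != running:
            return False
    return True


rows = load_rows(SAMPLE)
heatup, training = split_heatup(rows)
print(len(heatup), len(training), total_steps_consistent(rows))
```

To apply the same check to a full trace file, the text passed to `load_rows` would need the empty cells of the heatup rows handled first, since `float("")` raises a `ValueError`; the excerpt avoids this by keeping only columns that are always populated.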