Mirror of https://github.com/gryf/coach.git, synced 2026-01-28 11:05:46 +01:00
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()
* tests readme file and refactoring of policy optimization agent train function
* Update README.md
* additional policy optimization train function simplifications
* Updated the traces after the reordering of the environment reset
* docker and jenkins files
* updated the traces to the ones from within the docker container
* updated traces and added control suite to the docker
* updated jenkins file with the intel proxy + updated doom basic a3c test params
* updated line breaks in jenkins file
* added a missing line break in jenkins file
* refining trace tests ignored presets + adding a configurable beta entropy value
* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue
* updated benchmarks for dueling ddqn breakout and pong
* allowing dynamic updates to the loss weights + bug fix in episode.update_returns
* remove docker and jenkins file
6.6 KiB
| Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Q/Mean | Q/Stdev | Q/Max | Q/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0 | 1.0 | 486.0 | 486.0 | 486.0 | 486.0 | 1.0 | | | 0.0 | | | | | | | | | | | | | | | | | | | |
| 2 | 0.0 | 1.0 | 573.0 | 573.0 | 87.0 | 573.0 | 1.0 | | | 0.0 | | | | | | | | | | | | | | | | | | | |
| 3 | 0.0 | 1.0 | 722.0 | 722.0 | 149.0 | 722.0 | 1.0 | | | 0.0 | | | | | | | | | | | | | | | | | | | |
| 4 | 0.0 | 1.0 | 1057.0 | 1057.0 | 335.0 | 1057.0 | 1.0 | | | 0.0 | | | | | | | | | | | | | | | | | | | |
| 5 | 51.0 | 0.0 | 1260.0 | 1260.0 | 203.0 | 1260.0 | 0.9997990300000044 | 5.0 | 55.0 | 0.0 | | | | 3.931745547874301 | 1.1304143308005733e-05 | 3.9317700862884517 | 3.9316890239715576 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0015866637 | 0.0023147496 | 0.009720516 | 0.0007565144 | | | | |
| 6 | 70.0 | 0.0 | 1335.0 | 1335.0 | 75.0 | 1335.0 | 0.999724780000006 | 2.0 | 15.0 | 0.0 | | | | 3.931735929689909 | 7.235250669185713e-06 | 3.9317455291748047 | 3.931711912155152 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.001755927 | 0.0026373637 | 0.009641058000000001 | 0.0007148395500000001 | | | | |
| 7 | 91.0 | 0.0 | 1422.0 | 1422.0 | 87.0 | 1422.0 | 0.9996386500000078 | 1.0 | 15.0 | 0.0 | | | | 3.931701206025623 | 5.099185273705093e-05 | 3.93173623085022 | 3.931553602218628 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.00282516 | 0.0035993564000000003 | 0.009418896999999999 | 0.00070036505 | | | | |
| 8 | 159.0 | 0.0 | 1693.0 | 1693.0 | 271.0 | 1693.0 | 0.9993703600000136 | 5.0 | 55.0 | 0.0 | | | | 3.931656876606728 | 9.226522325134514e-05 | 3.931716203689575 | 3.9311599731445312 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0017165048 | 0.0027771054 | 0.012550428999999998 | 0.00063557597 | | | | |
| 9 | 201.0 | 0.0 | 1861.0 | 1861.0 | 168.0 | 1861.0 | 0.9992040400000172 | 3.0 | 50.0 | 0.0 | | | | 3.93158137230646 | 0.00010474746316658029 | 3.93165135383606 | 3.931158781051636 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0017799592 | 0.002852903 | 0.009889938000000001 | 0.0006321495 | | | | |
| 10 | 279.0 | 0.0 | 2172.0 | 2172.0 | 311.0 | 2172.0 | 0.998896150000024 | 4.0 | 65.0 | 0.0 | | | | 3.9314154325387416 | 0.00022279076273676361 | 3.931581258773804 | 3.930070638656616 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.0026444625 | 0.004101368 | 0.019448647 | 0.00062298455 | | | | |
| 11 | 407.0 | 0.0 | 2683.0 | 2683.0 | 511.0 | 2683.0 | 0.9983902600000351 | 9.0 | 320.0 | 0.0 | | | | 3.931240767240524 | 0.00015854553450050932 | 3.931373834609986 | 3.9305694103240967 | 0.0002500000000000001 | 1.0842021724855042e-19 | 0.00025 | 0.00025 | 0.0017362889000000002 | 0.0029191163 | 0.011232911000000002 | 0.0005518816 | 0.024129094928503625 | 0.00904320207528025 | 0.04169573485851341 | 0.012993395328522341 |
| 12 | 424.0 | 0.0 | 2754.0 | 2754.0 | 71.0 | 2754.0 | 0.9983199700000364 | 1.0 | 15.0 | 0.0 | | | | 3.9310134579153617 | 0.000358822446065362 | 3.9312620162963863 | 3.93019700050354 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0032191304 | 0.0045680365999999995 | 0.011846525 | 0.0005631294 | | | | |
| 13 | 457.0 | 0.0 | 2886.0 | 2886.0 | 132.0 | 2886.0 | 0.9981892900000392 | 2.0 | 25.0 | 0.0 | | | | 3.931020930409432 | 0.00018348520436412824 | 3.93116307258606 | 3.9301931858062735 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0016074966000000001 | 0.002708108 | 0.012189938999999999 | 0.00064319023 | | | | |
| 14 | 504.0 | 0.0 | 3074.0 | 3074.0 | 188.0 | 3074.0 | 0.9980031700000432 | 3.0 | 35.0 | 0.0 | | | | 3.931025738301485 | 0.00015399033015145469 | 3.9311008453369136 | 3.9300034046173096 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.0009836307000000001 | 0.0015528154999999999 | 0.011335916000000001 | 0.0005281837 | 0.021890531852842 | 0.014466196029404797 | 0.03762449622154298 | -0.00010702610015811408 |
| 15 | 528.0 | 0.0 | 3167.0 | 3167.0 | 93.0 | 3167.0 | 0.9979111000000452 | 1.0 | 15.0 | 0.0 | | | | 3.931034368017445 | 6.681329033129075e-05 | 3.9310910701751713 | 3.9308130741119385 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.0016258046 | 0.0031157003000000004 | 0.011724828999999999 | 0.0004990724 | | | | |
| 16 | 598.0 | 0.0 | 3449.0 | 3449.0 | 282.0 | 3449.0 | 0.9976319200000514 | 2.0 | 55.0 | 0.0 | | | | 3.930865437643869 | 0.0002831683814269127 | 3.9310925006866455 | 3.9296932220458975 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0021639192000000003 | 0.0036756303 | 0.012480825 | 0.00047580682 | 0.0305326876540985 | 0.00923045073371262 | 0.0438561670482166 | 0.021990178897977525 |
| 17 | 668.0 | 0.0 | 3729.0 | 3729.0 | 280.0 | 3729.0 | 0.9973547200000574 | 4.0 | 45.0 | 0.0 | | | | 3.930601644515991 | 0.0003480949830888473 | 3.9308652877807617 | 3.9292240142822266 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0027047140000000004 | 0.0042121527 | 0.013903748 | 0.0006515036 | 0.029686454000572925 | 0.01902240661079521 | 0.058909925818443724 | -0.002814809605478502 |
| 18 | 738.0 | 0.0 | 4008.0 | 4008.0 | 279.0 | 4008.0 | 0.9970785100000634 | 3.0 | 45.0 | 0.0 | | | | 3.930277814183917 | 0.0005110864544539217 | 3.930636167526245 | 3.9281198978424072 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0036887865999999997 | 0.0051101656 | 0.02086526 | 0.0008114038499999999 | | | | |
| 19 | 792.0 | 0.0 | 4223.0 | 4223.0 | 215.0 | 4223.0 | 0.9968656600000679 | 4.0 | 90.0 | 0.0 | | | | 3.930134093319928 | 0.00047399490243328624 | 3.930522918701172 | 3.9284942150115967 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0037553142999999997 | 0.004891396 | 0.0151908295 | 0.0010129308999999999 | 0.030484313207368687 | 0.012861465171913404 | 0.04339803867042122 | 0.011251759901643564 |
| 20 | 893.0 | 0.0 | 4630.0 | 4630.0 | 407.0 | 4630.0 | 0.9964627300000768 | 10.0 | 115.0 | 0.0 | | | | 3.929736850285294 | 0.000766311293002325 | 3.930368661880493 | 3.926426649093628 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.004180377 | 0.006044741700000001 | 0.031209853 | 0.0009457401 | 0.0355303921426342 | 0.014215304187277478 | 0.05135452747345026 | 0.012931596487761264 |
| 21 | 937.0 | 0.0 | 4805.0 | 4805.0 | 175.0 | 4805.0 | 0.9962894800000806 | 1.0 | 5.0 | 0.0 | | | | 3.930479177208834 | 0.00023689537153461093 | 3.930684328079224 | 3.929489135742188 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0021777323 | 0.0034656142999999998 | 0.014414535 | 0.00058991817 | | | | |
| 22 | 1022.0 | 0.0 | 5146.0 | 5146.0 | 341.0 | 5146.0 | 0.995951890000088 | 3.0 | 65.0 | 0.0 | | | | 3.93046441078186 | 0.00035790579075833815 | 3.9307262897491455 | 3.929075956344605 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.0024591651999999998 | 0.0041425275 | 0.014986968 | 0.00047587089999999996 | | | | |
| 23 | 1120.0 | 0.0 | 5538.0 | 5538.0 | 392.0 | 5538.0 | 0.9955638100000964 | 6.0 | 80.0 | 0.0 | | | | 3.9300818811986864 | 0.0004955548155674863 | 3.930469751358032 | 3.928221464157105 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.0027119769 | 0.0042804847 | 0.020022545 | 0.00057932467 | 0.035476823337376714 | 0.014014476077534626 | 0.04927191920578547 | 0.007030519843102157 |
| 24 | 1165.0 | 0.0 | 5718.0 | 5718.0 | 180.0 | 5718.0 | 0.9953856100001002 | 3.0 | 55.0 | 0.0 | | | | 3.9300585714253513 | 0.00053408662096811 | 3.9305152893066406 | 3.928401231765747 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0033172776 | 0.004863477299999999 | 0.016010353 | 0.0006072133999999999 | | | | |
| 25 | 1190.0 | 0.0 | 5815.0 | 5815.0 | 97.0 | 5815.0 | 0.9952895800001024 | 0.0 | 0.0 | 0.0 | | | | 3.9299856324990587 | 0.0004047671393024133 | 3.9302456378936768 | 3.9285967350006095 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0030563176 | 0.004401815999999999 | 0.016188376 | 0.0011075826 | | | | |