Zach Dwiel
|
18d84c5037
|
remove unnecessary timers from GraphManager
|
2018-10-23 16:58:17 -04:00 |
|
Zach Dwiel
|
cd30efe52e
|
remove unnecessary test result is None in GraphManager.act
|
2018-10-23 16:57:43 -04:00 |
|
Zach Dwiel
|
35d67cbd9b
|
use phase context in GraphManager.evaluate
|
2018-10-23 16:57:43 -04:00 |
|
Zach Dwiel
|
d3c341147e
|
simplify GraphManager.act by removing arguments: continue_until_game_over and return_on_game_over
|
2018-10-23 16:57:43 -04:00 |
|
Zach Dwiel
|
8be980912c
|
fixed typo from earlier commit
|
2018-10-23 16:57:43 -04:00 |
|
Zach Dwiel
|
fbaf19543e
|
capture stdout during preset tests
|
2018-10-23 16:57:43 -04:00 |
|
Zach Dwiel
|
517aac163a
|
introduce graph_manager.phase_context; make sure that calls to graph_manager.train automatically set training phase
|
2018-10-23 16:57:43 -04:00 |
|
Zach Dwiel
|
7382a142bb
|
remove unused steps parameter from GraphManager.train
|
2018-10-23 16:57:06 -04:00 |
|
Zach Dwiel
|
97f608ee5e
|
reorder failing presets
|
2018-10-23 16:57:05 -04:00 |
|
Zach Dwiel
|
ad68fa263d
|
remove property GraphManager.training_start_time
|
2018-10-23 16:57:05 -04:00 |
|
Zach Dwiel
|
bfc320cf83
|
disable failing tests for now
|
2018-10-23 16:57:05 -04:00 |
|
Zach Dwiel
|
01f3a0594b
|
remove return values from GraphManager.act
|
2018-10-23 16:57:05 -04:00 |
|
Zach Dwiel
|
b02f269464
|
graph_manager:heatup uses total_steps_counters looping mechanism like other loops. graph_manager:act no longer needs to return any values
|
2018-10-23 16:57:05 -04:00 |
|
Balaji Subramaniam
|
ca9015d8b1
|
Make NFS work end-to-end.
|
2018-10-23 16:55:37 -04:00 |
|
Ajay Deshpande
|
fb1039fcb5
|
Checkpoint and evaluation optimizations
|
2018-10-23 16:55:37 -04:00 |
|
Ajay Deshpande
|
b285a02023
|
Adding parameters, checking transitions before training
|
2018-10-23 16:55:37 -04:00 |
|
Ajay Deshpande
|
0f46877d7e
|
Adding steps and waiting for new checkpoint
|
2018-10-23 16:55:37 -04:00 |
|
Ajay Deshpande
|
0e121c5762
|
Ignoring redis sub if testing
|
2018-10-23 16:55:37 -04:00 |
|
Ajay Deshpande
|
7f00235ed5
|
waiting for a new checkpoint if it's available
|
2018-10-23 16:54:43 -04:00 |
|
Ajay Deshpande
|
5eac0102de
|
Changing exception type
|
2018-10-23 16:54:43 -04:00 |
|
Ajay Deshpande
|
a7f5442015
|
Adding should_train helper and should_train in graph_manager
|
2018-10-23 16:54:43 -04:00 |
|
Ajay Deshpande
|
a2e57a44f1
|
Getting only the model_checkpoint_path files
|
2018-10-23 16:54:43 -04:00 |
|
Ajay Deshpande
|
052bbc8f19
|
Adding lock in s3
|
2018-10-23 16:54:43 -04:00 |
|
Balaji Subramaniam
|
844a5af831
|
Make distributed coach work end-to-end.
- With data store, memory backend and orchestrator interfaces.
|
2018-10-23 16:54:43 -04:00 |
|
Zach Dwiel
|
9f92064e67
|
cleanup graph_manager:act
|
2018-10-23 16:53:32 -04:00 |
|
Zach Dwiel
|
b5305bd075
|
update dockerfile
|
2018-10-23 16:52:16 -04:00 |
|
Zach Dwiel
|
950f261201
|
extract method all_presets
|
2018-10-23 16:52:16 -04:00 |
|
Zach Dwiel
|
ed3a3b39be
|
add comments
|
2018-10-23 16:52:16 -04:00 |
|
Zach Dwiel
|
04038c9f40
|
improve integration test output format
|
2018-10-23 16:52:16 -04:00 |
|
Balaji Subramaniam
|
1c238b4c60
|
Added data store backend. (#17)
* Added data store backend.
* Add NFS implementation for Kubernetes.
* Added S3 data store implementation.
* Addressed review comments.
|
2018-10-23 16:52:16 -04:00 |
|
Ajay Deshpande
|
6b2de6ba6d
|
Adding initial interface for backend and redis pubsub (#19)
* Adding initial interface for backend and redis pubsub
* Addressing comments, adding super in all memories
* Removing distributed experience replay
|
2018-10-23 16:51:48 -04:00 |
|
Zach Dwiel
|
a54ef2757f
|
ignore deprecation warnings in test logging
|
2018-10-23 16:51:48 -04:00 |
|
Zach Dwiel
|
acc7f70de3
|
enumerate each preset as its own test
|
2018-10-23 16:51:48 -04:00 |
|
Zach Dwiel
|
1e83a27bee
|
update dockerfile and makefile
|
2018-10-23 16:51:48 -04:00 |
|
Zach Dwiel
|
67faa80ea0
|
allow custom number of training steps
|
2018-10-23 16:51:48 -04:00 |
|
Zach Dwiel
|
d69332efd4
|
fixed bug in training worker
|
2018-10-23 16:51:48 -04:00 |
|
Zach Dwiel
|
cd733b2404
|
add support for running kubernetes orchestrator from behind proxy
|
2018-10-23 16:51:48 -04:00 |
|
Zach Dwiel
|
ad4d2c3053
|
add make stop_kubernetes
|
2018-10-23 16:51:48 -04:00 |
|
Zach Dwiel
|
5e85a0f972
|
use the number of heat up steps specified in schedule parameters
|
2018-10-23 16:51:48 -04:00 |
|
Ajay Deshpande
|
98850464cc
|
Adding nfs pv, pvc, waiting for memory to be full
|
2018-10-23 16:50:48 -04:00 |
|
Zach Dwiel
|
13d81f65b9
|
add redis options to training worker
|
2018-10-23 16:47:46 -04:00 |
|
Zach Dwiel
|
04f32a0f02
|
add heatup step to training worker
|
2018-10-23 16:47:46 -04:00 |
|
Zach Dwiel
|
7c1f0dce4f
|
include registry in image name
|
2018-10-23 16:47:46 -04:00 |
|
Zach Dwiel
|
0812a94fbd
|
first pass at kubernetes
|
2018-10-23 16:47:46 -04:00 |
|
Zach Dwiel
|
3328b25549
|
reenable redis; better error message
|
2018-10-23 16:47:46 -04:00 |
|
Zach Dwiel
|
009cf670f3
|
fix simple typos; temporarily disable redis in rollout worker
|
2018-10-23 16:47:46 -04:00 |
|
Zach Dwiel
|
f5b7122d56
|
wait for checkpoint before trying to start rollout worker
|
2018-10-23 16:47:46 -04:00 |
|
Zach Dwiel
|
4352d6735d
|
add training worker
|
2018-10-23 16:47:46 -04:00 |
|
Ajay Deshpande
|
28926bf2a4
|
Changing parameters
|
2018-10-23 16:47:46 -04:00 |
|
Ajay Deshpande
|
c2991819b4
|
Adding right arguments to the agent
|
2018-10-23 16:46:04 -04:00 |
|