Balaji Subramaniam
|
bf2036b284
|
S3 optimization - save only the latest checkpoint. (#148)
|
2018-11-23 22:17:36 -08:00 |
|
Balaji Subramaniam
|
13d2679af4
|
Sync experiment dir, videos, gifs to S3. (#147)
|
2018-11-23 20:52:12 -08:00 |
|
Sina Afrooze
|
5332013bd1
|
Implement framework-agnostic rollout and training workers (#137)
* Added checkpoint state file to coach checkpointing.
* Removed TF-specific code from rollout_worker, training_worker, and s3_data_store
|
2018-11-23 18:05:44 -08:00 |
|
Cody Hsieh
|
dd18959e53
|
Don't download when checkpoint files are already present (#109)
* add check if checkpoint file present
|
2018-11-21 15:32:53 -08:00 |
|
Gal Leibovich
|
d4d06aaea6
|
remove kubernetes dependency (#117)
|
2018-11-18 18:10:22 +02:00 |
|
Ajay Deshpande
|
875d6ef017
|
Adding target reward and target success (#58)
* Adding target reward
* Adding target success
* Addressing comments
* Using custom_reward_threshold and target_success_rate
* Adding exit message
* Moving success rate to environment
* Making target_success_rate optional
|
2018-11-12 15:03:43 -08:00 |
|
Ajay Deshpande
|
0f46877d7e
|
Adding steps and waiting for new checkpoint
|
2018-10-23 16:55:37 -04:00 |
|
Ajay Deshpande
|
5eac0102de
|
Changing exception type
|
2018-10-23 16:54:43 -04:00 |
|
Ajay Deshpande
|
a7f5442015
|
Adding should_train helper and should_train in graph_manager
|
2018-10-23 16:54:43 -04:00 |
|
Ajay Deshpande
|
a2e57a44f1
|
Getting only the model_checkpoint_path files
|
2018-10-23 16:54:43 -04:00 |
|
Ajay Deshpande
|
052bbc8f19
|
Adding lock in s3
|
2018-10-23 16:54:43 -04:00 |
|
Balaji Subramaniam
|
844a5af831
|
Make distributed coach work end-to-end.
- With data store, memory backend and orchestrator interfaces.
|
2018-10-23 16:54:43 -04:00 |
|
Balaji Subramaniam
|
1c238b4c60
|
Added data store backend. (#17)
* Added data store backend.
* Add NFS implementation for Kubernetes.
* Added S3 data store implementation.
* Addressed review comments.
|
2018-10-23 16:52:16 -04:00 |
|