Ajay Deshpande
|
33dc29ee99
|
Uploading checkpoint if crd provided (#191)
* Uploading checkpoint if crd provided
* Changing the calculation of total steps because of a recent change in core_types
Fixes #195
|
2019-04-26 12:27:33 -07:00 |
|
Gal Novik
|
fc6604c09c
|
added missing license headers
|
2018-11-27 22:43:40 +02:00 |
|
Balaji Subramaniam
|
d06197f663
|
Add documentation on distributed Coach. (#158)
* Added documentation on distributed Coach.
|
2018-11-27 12:26:15 +02:00 |
|
Balaji Subramaniam
|
bf2036b284
|
S3 optimization - save only the latest checkpoint. (#148)
|
2018-11-23 22:17:36 -08:00 |
|
Balaji Subramaniam
|
13d2679af4
|
Sync experiment dir, videos, gifs to S3. (#147)
|
2018-11-23 20:52:12 -08:00 |
|
Sina Afrooze
|
5332013bd1
|
Implement framework-agnostic rollout and training workers (#137)
* Added checkpoint state file to coach checkpointing.
* Removed TF specific code from rollout_worker, training_worker, and s3_data_store
|
2018-11-23 18:05:44 -08:00 |
|
Cody Hsieh
|
dd18959e53
|
Don't download when checkpoint files are already present (#109)
* add check if checkpoint file present
|
2018-11-21 15:32:53 -08:00 |
|
Gal Leibovich
|
d4d06aaea6
|
remove kubernetes dependency (#117)
|
2018-11-18 18:10:22 +02:00 |
|
Balaji Subramaniam
|
dea1826658
|
Re-enable NFS data store. (#101)
|
2018-11-16 13:55:33 -08:00 |
|
Ajay Deshpande
|
875d6ef017
|
Adding target reward and target success (#58)
* Adding target reward
* Adding target success
* Addressing comments
* Using custom_reward_threshold and target_success_rate
* Adding exit message
* Moving success rate to environment
* Making target_success_rate optional
|
2018-11-12 15:03:43 -08:00 |
|
Ajay Deshpande
|
9a30c26469
|
Adding improvements
|
2018-10-23 19:59:02 -04:00 |
|
Balaji Subramaniam
|
ca9015d8b1
|
Make NFS work end-to-end.
|
2018-10-23 16:55:37 -04:00 |
|
Ajay Deshpande
|
0f46877d7e
|
Adding steps and waiting for new checkpoint
|
2018-10-23 16:55:37 -04:00 |
|
Ajay Deshpande
|
5eac0102de
|
Changing exception type
|
2018-10-23 16:54:43 -04:00 |
|
Ajay Deshpande
|
a7f5442015
|
Adding should_train helper and should_train in graph_manager
|
2018-10-23 16:54:43 -04:00 |
|
Ajay Deshpande
|
a2e57a44f1
|
Getting only the model_checkpoint_path files
|
2018-10-23 16:54:43 -04:00 |
|
Ajay Deshpande
|
052bbc8f19
|
Adding lock in s3
|
2018-10-23 16:54:43 -04:00 |
|
Balaji Subramaniam
|
844a5af831
|
Make distributed coach work end-to-end.
- With data store, memory backend and orchestrator interfaces.
|
2018-10-23 16:54:43 -04:00 |
|
Balaji Subramaniam
|
1c238b4c60
|
Added data store backend. (#17)
* Added data store backend.
* Add NFS implementation for Kubernetes.
* Added S3 data store implementation.
* Addressed review comments.
|
2018-10-23 16:52:16 -04:00 |
|