mirror of https://github.com/gryf/coach.git synced 2025-12-17 19:20:19 +01:00

18 Commits

Author SHA1 Message Date
Guy Jacob
f52ff1784d Fix breaking change from minio update (#469)
`ResponseError` replaced by `S3Error` in new minio version
2020-12-15 10:02:16 +02:00
Zach Dwiel
7b0fccb041 Add RedisDataStore (#295)
* GraphManager.set_session also sets self.sess

* make sure that GraphManager.fetch_from_worker uses training phase

* remove unnecessary phase setting in training worker

* reorganize rollout worker

* provide default name to GlobalVariableSaver.__init__ since it isn't really used anyway

* allow dividing TrainingSteps and EnvironmentSteps

* add timestamps to the log

* added redis data store

* conflict merge fix
2019-08-28 21:15:58 +03:00
Ajay Deshpande
33dc29ee99 Uploading checkpoint if crd provided (#191)
* Uploading checkpoint if crd provided
* Changing the calculation of total steps because of a recent change in core_types

Fixes #195
2019-04-26 12:27:33 -07:00
Gal Novik
fc6604c09c added missing license headers 2018-11-27 22:43:40 +02:00
Balaji Subramaniam
d06197f663 Add documentation on distributed Coach. (#158)
* Added documentation on distributed Coach.
2018-11-27 12:26:15 +02:00
Balaji Subramaniam
bf2036b284 S3 optimization - save only the latest checkpoint. (#148) 2018-11-23 22:17:36 -08:00
Balaji Subramaniam
13d2679af4 Sync experiment dir, videos, gifs to S3. (#147) 2018-11-23 20:52:12 -08:00
Sina Afrooze
5332013bd1 Implement frame-work agnostic rollout and training workers (#137)
* Added checkpoint state file to coach checkpointing.

* Removed TF specific code from rollout_worker, training_worker, and s3_data_store
2018-11-23 18:05:44 -08:00
Cody Hsieh
dd18959e53 Don't download when checkpoint files are already present (#109)
* add check if checkpoint file present
2018-11-21 15:32:53 -08:00
Gal Leibovich
d4d06aaea6 remove kubernetes dependency (#117) 2018-11-18 18:10:22 +02:00
Ajay Deshpande
875d6ef017 Adding target reward and target success (#58)
* Adding target reward

* Adding target success

* Addressing comments

* Using custom_reward_threshold and target_success_rate

* Adding exit message

* Moving success rate to environment

* Making target_success_rate optional
2018-11-12 15:03:43 -08:00
Ajay Deshpande
0f46877d7e Adding steps and waiting for new checkpoint 2018-10-23 16:55:37 -04:00
Ajay Deshpande
5eac0102de Changing exception type 2018-10-23 16:54:43 -04:00
Ajay Deshpande
a7f5442015 Adding should_train helper and should_train in graph_manager 2018-10-23 16:54:43 -04:00
Ajay Deshpande
a2e57a44f1 Getting only the model_checkpoint_path files 2018-10-23 16:54:43 -04:00
Ajay Deshpande
052bbc8f19 Adding lock in s3 2018-10-23 16:54:43 -04:00
Balaji Subramaniam
844a5af831 Make distributed coach work end-to-end.
- With data store, memory backend and orchestrator interfaces.
2018-10-23 16:54:43 -04:00
Balaji Subramaniam
1c238b4c60 Added data store backend. (#17)
* Added data store backend.
* Add NFS implementation for Kubernetes.
* Added S3 data store implementation.
* Addressed review comments.
2018-10-23 16:52:16 -04:00