mirror of https://github.com/gryf/coach.git synced 2025-12-17 19:20:19 +01:00

34 Commits

Author SHA1 Message Date
shadiendrawis
6ca91b9090 add reset internal state to rollout worker (#421) 2019-11-03 14:42:51 +02:00
Zach Dwiel
7b0fccb041 Add RedisDataStore (#295)
* GraphManager.set_session also sets self.sess

* make sure that GraphManager.fetch_from_worker uses training phase

* remove unnecessary phase setting in training worker

* reorganize rollout worker

* provide default name to GlobalVariableSaver.__init__ since it isn't really used anyway

* allow dividing TrainingSteps and EnvironmentSteps

* add timestamps to the log

* added redis data store

* conflict merge fix
2019-08-28 21:15:58 +03:00
Ajay Deshpande
33dc29ee99 Uploading checkpoint if crd provided (#191)
* Uploading checkpoint if crd provided
* Changing the calculation of total steps because of a recent change in core_types

Fixes #195
2019-04-26 12:27:33 -07:00
zach dwiel
54fdfe2da8 simplify rollout worker steps with new magic methods on StepMethod 2019-04-09 12:14:27 -04:00
zach dwiel
83da5cde2f remove unnecessary parentheses 2019-04-09 12:14:27 -04:00
zach dwiel
dddaefb210 fixed bug in rollout worker where total number of improved steps are not taken 2019-04-09 12:14:27 -04:00
Gal Leibovich
d6158a5cfc restoring from a checkpoint file (#247) 2019-03-17 16:28:09 +02:00
Zach Dwiel
fedb4cbd7c Cleanup and refactoring (#171) 2019-01-15 10:04:53 +02:00
Gal Novik
fc6604c09c added missing license headers 2018-11-27 22:43:40 +02:00
Sina Afrooze
5332013bd1 Implement frame-work agnostic rollout and training workers (#137)
* Added checkpoint state file to coach checkpointing.

* Removed TF specific code from rollout_worker, training_worker, and s3_data_store
2018-11-23 18:05:44 -08:00
Ajay Deshpande
4a6c404070 Adding worker logs and plumbed task_parameters to distributed coach (#130) 2018-11-23 15:35:11 -08:00
Balaji Subramaniam
dea1826658 Re-enable NFS data store. (#101) 2018-11-16 13:55:33 -08:00
Balaji Subramaniam
101c55d37d Handle both Environment Steps and Episodes on the subscriber side. (#99) 2018-11-15 14:42:21 -08:00
Ajay Deshpande
fde73ced13 Simulating the act on the trainer. (#65)
* Remove the use of daemon threads for Redis subscribe.
* Emulate act and observe on trainer side to update internal vars.
2018-11-15 08:38:58 -08:00
Ajay Deshpande
875d6ef017 Adding target reward and target success (#58)
* Adding target reward

* Adding target success

* Addressing comments

* Using custom_reward_threshold and target_success_rate

* Adding exit message

* Moving success rate to environment

* Making target_success_rate optional
2018-11-12 15:03:43 -08:00
Balaji Subramaniam
7e7006305a Integrate coach.py params with distributed Coach. (#42)
* Integrate coach.py params with distributed Coach.
* Minor improvements
- Use enums instead of constants.
- Reduce code duplication.
- Ask experiment name with timeout.
2018-11-05 09:33:30 -08:00
Ajay Deshpande
9a30c26469 Adding improvements 2018-10-23 19:59:02 -04:00
Zach Dwiel
517aac163a introduce graph_manager.phase_context; make sure that calls to graph_manager.train automatically set training phase 2018-10-23 16:57:43 -04:00
Ajay Deshpande
fb1039fcb5 Checkpoint and evaluation optimizations 2018-10-23 16:55:37 -04:00
Ajay Deshpande
b285a02023 Adding parameters, checking transitions before training 2018-10-23 16:55:37 -04:00
Ajay Deshpande
0f46877d7e Adding steps and waiting for new checkpoint 2018-10-23 16:55:37 -04:00
Ajay Deshpande
7f00235ed5 waiting for a new checkpoint if it's available 2018-10-23 16:54:43 -04:00
Balaji Subramaniam
844a5af831 Make distributed coach work end-to-end.
- With data store, memory backend and orchestrator interfaces.
2018-10-23 16:54:43 -04:00
Ajay Deshpande
6b2de6ba6d Adding initial interface for backend and redis pubsub (#19)
* Adding initial interface for backend and redis pubsub

* Addressing comments, adding super in all memories

* Removing distributed experience replay
2018-10-23 16:51:48 -04:00
Zach Dwiel
0812a94fbd first pass at kubernetes 2018-10-23 16:47:46 -04:00
Zach Dwiel
3328b25549 reenable redis; better error message 2018-10-23 16:47:46 -04:00
Zach Dwiel
009cf670f3 fix simple typos; temporarily disable redis in rollout worker 2018-10-23 16:47:46 -04:00
Zach Dwiel
f5b7122d56 wait for checkpoint before trying to start rollout worker 2018-10-23 16:47:46 -04:00
Ajay Deshpande
28926bf2a4 Changing parameters 2018-10-23 16:47:46 -04:00
Ajay Deshpande
c2991819b4 Adding right arguments to the agent 2018-10-23 16:46:04 -04:00
Ajay Deshpande
ce9838a7d6 Adding kubernetes orchestrator for rollouts, adding requirements for incremental docker builds 2018-10-23 16:46:04 -04:00
Zach Dwiel
6541bc76b9 working checkpoints 2018-10-23 16:41:57 -04:00
Zach Dwiel
e34b9ae9cf allow specifying preset as a commandline parameter to rollout worker 2018-10-23 16:40:33 -04:00
Zach Dwiel
bc664c4169 add the first pass of rollout_worker.py 2018-10-23 16:40:33 -04:00