mirror of https://github.com/gryf/coach.git synced 2025-12-18 11:40:18 +01:00
coach/rl_coach/graph_managers
gouravr 801aed5e10 Changes to avoid memory leak in rollout worker
Currently, the rollout worker calls restore_checkpoint repeatedly to load the latest model into memory. The restore_checkpoint function calls checkpoint_saver. The checkpoint saver uses GlobalVariablesSaver, which does not release its references to the previous model's variables. As a result, memory keeps growing until the rollout worker crashes.

This change avoids using the checkpoint saver in the rollout worker, as I believe it is not needed in this code path.

Also added a test that easily reproduces the issue using the CartPole example. We were also seeing this issue with the AWS DeepRacer implementation, and the current implementation avoids the memory leak there as well.
2018-12-15 12:26:31 -08:00

Block Factory

The block factory is a class that creates a block fitting a specific RL scheme, such as self-play, multi-agent, hierarchical RL (HRL), or basic RL. The factory should create all the components of the block and return the block scheduler. The same factory can then be used to build different combinations of components. For example, an HRL factory could later be instantiated with:

  • env = Atari Breakout
  • master (top hierarchy level) agent = DDPG
  • slave (bottom hierarchy level) agent = DQN

A custom block factory implementation should look as follows:

class CustomFactory(BlockFactory):
    def __init__(self, custom_params):
        super().__init__()
        # Keep the parameters so _create_block can use them
        self.custom_params = custom_params

    def _create_block(self, task_index: int, device=None) -> BlockScheduler:
        """
        Create all the block modules and the block scheduler
        :param task_index: the index of the process on which the worker will be run
        :param device: optional device for the block's components
        :return: the initialized block scheduler
        """

        # Create env
        # Create composite agents
        # Create level managers
        # Create block scheduler
        block_scheduler = ...  # placeholder: build the scheduler from the parts above

        return block_scheduler
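
The template above leaves its four steps as comments. As an illustration only, here is a minimal self-contained sketch of how they might be filled in for the HRL example (Atari Breakout, DDPG master, DQN slave). Everything in it is an assumption made for the sketch: `BlockFactory` and `BlockScheduler` are simplified stand-ins rather than the real rl_coach classes, the public `create_block` wrapper is hypothetical, and environments, agents and level managers are represented by plain strings.

```python
from typing import List, Optional


class BlockScheduler:
    """Stand-in scheduler (hypothetical): just records what it was built from."""

    def __init__(self, env: str, level_managers: List[str]):
        self.env = env
        self.level_managers = level_managers


class BlockFactory:
    """Stand-in base class (hypothetical): subclasses implement _create_block."""

    def create_block(self, task_index: int, device: Optional[str] = None) -> BlockScheduler:
        # Assumed public entry point that delegates to the subclass hook
        return self._create_block(task_index, device)

    def _create_block(self, task_index: int, device: Optional[str] = None) -> BlockScheduler:
        raise NotImplementedError


class HRLFactory(BlockFactory):
    def __init__(self, env: str, master_agent: str, slave_agent: str):
        super().__init__()
        self.env = env
        self.master_agent = master_agent
        self.slave_agent = slave_agent

    def _create_block(self, task_index: int, device: Optional[str] = None) -> BlockScheduler:
        # Create env (a plain string in this sketch)
        env = self.env
        # Create composite agents, top hierarchy level first
        agents = [self.master_agent, self.slave_agent]
        # Create level managers: one per hierarchy level
        level_managers = [f"level {i}: {agent}" for i, agent in enumerate(agents)]
        # Create block scheduler
        return BlockScheduler(env, level_managers)


factory = HRLFactory(env="Atari Breakout", master_agent="DDPG", slave_agent="DQN")
scheduler = factory.create_block(task_index=0)
print(scheduler.level_managers)
```

Constructing `HRLFactory` with other environment or agent names reuses the same scheme with a different combination of components, which is the purpose of the factory described above.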