mirror of https://github.com/gryf/coach.git synced 2025-12-17 19:20:19 +01:00

Clipped PPO

Each experiment is run with 3 seeds and trained for 10k environment steps. The Clipped PPO parameters are the same as those described in the original paper.
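The setup above (3 seeds per experiment, one command per level) can be sketched as a small helper that builds the full list of runs. This is only an illustration, not the script used to produce the benchmarks. The level names and the preset are taken from the commands in this README. The seed values 0, 1, 2 are hypothetical, since only the number of seeds is stated, and the `--seed` option is an assumption about the `coach.py` command line that should be checked against `python3 coach.py --help`.

```python
# Level names taken from the commands listed in this README.
LEVELS = [
    "inverted_pendulum",
    "inverted_double_pendulum",
    "reacher",
    "hopper",
    "half_cheetah",
    "walker2d",
    "ant",
    "swimmer",
    "humanoid",
]

# Hypothetical seed values: the README only says that 3 seeds are used.
SEEDS = [0, 1, 2]


def build_commands(levels=LEVELS, seeds=SEEDS, preset="Mujoco_ClippedPPO"):
    """Return one command (as an argument list) per (level, seed) pair."""
    commands = []
    for level in levels:
        for seed in seeds:
            commands.append([
                "python3", "coach.py",
                "-p", preset,
                "-lvl", level,
                # Assumption: coach.py accepts a --seed option.
                "--seed", str(seed),
            ])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for cmd in build_commands():
        print(" ".join(cmd))
```

Each returned argument list can be passed to `subprocess.run` to launch the runs one after another, once the flags have been confirmed for the installed Coach version.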

Inverted Pendulum Clipped PPO - single worker

python3 coach.py -p Mujoco_ClippedPPO -lvl inverted_pendulum

[Figure: Inverted Pendulum Clipped PPO]

Inverted Double Pendulum Clipped PPO - single worker

python3 coach.py -p Mujoco_ClippedPPO -lvl inverted_double_pendulum

[Figure: Inverted Double Pendulum Clipped PPO]

Reacher Clipped PPO - single worker

python3 coach.py -p Mujoco_ClippedPPO -lvl reacher

[Figure: Reacher Clipped PPO]

Hopper Clipped PPO - single worker

python3 coach.py -p Mujoco_ClippedPPO -lvl hopper

[Figure: Hopper Clipped PPO]

Half Cheetah Clipped PPO - single worker

python3 coach.py -p Mujoco_ClippedPPO -lvl half_cheetah

[Figure: Half Cheetah Clipped PPO]

Walker 2D Clipped PPO - single worker

python3 coach.py -p Mujoco_ClippedPPO -lvl walker2d

[Figure: Walker 2D Clipped PPO]

Ant Clipped PPO - single worker

python3 coach.py -p Mujoco_ClippedPPO -lvl ant

[Figure: Ant Clipped PPO]

Swimmer Clipped PPO - single worker

python3 coach.py -p Mujoco_ClippedPPO -lvl swimmer

[Figure: Swimmer Clipped PPO]

Humanoid Clipped PPO - single worker

python3 coach.py -p Mujoco_ClippedPPO -lvl humanoid

[Figure: Humanoid Clipped PPO]