# Twin Delayed DDPG

Each experiment uses 5 seeds and trains for 1M environment steps.

The TD3 parameters match those described in the [original paper](https://arxiv.org/pdf/1802.09477.pdf) and the accompanying [repository](https://github.com/sfujim/TD3).

### Ant TD3 - single worker

```bash
coach -p Mujoco_TD3 -lvl ant
```

<img src="ant.png" alt="Ant TD3" width="800"/>

### Hopper TD3 - single worker

```bash
coach -p Mujoco_TD3 -lvl hopper
```

<img src="hopper.png" alt="Hopper TD3" width="800"/>

### Half Cheetah TD3 - single worker

```bash
coach -p Mujoco_TD3 -lvl half_cheetah
```

<img src="half_cheetah.png" alt="Half Cheetah TD3" width="800"/>

### Reacher TD3 - single worker

```bash
coach -p Mujoco_TD3 -lvl reacher
```

<img src="reacher.png" alt="Reacher TD3" width="800"/>

### Walker2D TD3 - single worker

```bash
coach -p Mujoco_TD3 -lvl walker2d
```

<img src="walker2d.png" alt="Walker2D TD3" width="800"/>

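
To run all five environments back to back, the individual commands can be wrapped in a small loop. The following is a minimal sketch rather than a script shipped with Coach: the `DRY_RUN` variable and the `run_level` helper are hypothetical names introduced here for illustration, and only the `coach -p Mujoco_TD3 -lvl <level>` invocation comes from this page. Each level is launched once, so this does not by itself reproduce the 5-seed setup described at the top.

```bash
#!/usr/bin/env bash
# Sketch: launch every TD3 benchmark listed on this page, one after another.
set -euo pipefail

# Level names copied from the commands on this page.
LEVELS=(ant hopper half_cheetah reacher walker2d)

# DRY_RUN is a hypothetical convenience variable for this sketch, not a coach option.
# The default of 1 only prints the commands; set DRY_RUN=0 to actually launch them.
DRY_RUN="${DRY_RUN:-1}"

run_level() {
    local level="$1"
    if [ "$DRY_RUN" = "1" ]; then
        # Print the command that would be executed.
        echo "coach -p Mujoco_TD3 -lvl $level"
    else
        coach -p Mujoco_TD3 -lvl "$level"
    fi
}

for level in "${LEVELS[@]}"; do
    run_level "$level"
done
```

Saved under an assumed name such as `run_all_td3.sh`, `bash run_all_td3.sh` prints the five commands, and `DRY_RUN=0 bash run_all_td3.sh` runs them sequentially. Because of `set -e`, the loop stops at the first level whose run exits with an error.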