Soft Actor Critic

Each experiment is run with 3 seeds and trained for 3M environment steps. The SAC hyperparameters are the same as those described in the original paper.
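
To repeat the full set of runs, it may help to generate one command per level and seed. The following is a minimal dry-run sketch that only prints the commands rather than launching them. The seed values `0 1 2` and the experiment names are hypothetical (the text above states only that 3 seeds were used), and the `--seed` and `-e` options are assumed to be supported by your installed Coach version; check `coach --help` before relying on them.

```shell
#!/bin/sh
# Dry-run sketch: print one coach command per (level, seed) pair.
# Nothing is executed; pipe the output to `sh` once the commands look right.

# Levels benchmarked in this document.
LEVELS="inverted_pendulum hopper half_cheetah walker2d humanoid"
# Hypothetical seed values; the actual seeds used for the benchmark are not stated.
SEEDS="0 1 2"

build_commands() {
    for lvl in $LEVELS; do
        for seed in $SEEDS; do
            # --seed and -e are assumptions about the installed Coach CLI.
            echo "coach -p Mujoco_SAC -lvl $lvl --seed $seed -e sac_${lvl}_seed${seed}"
        done
    done
}

build_commands
```

Printing the commands first keeps the sketch safe to run on a machine without MuJoCo installed, and makes it easy to hand the list to a job scheduler instead of running the experiments sequentially.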

Inverted Pendulum SAC - single worker

coach -p Mujoco_SAC -lvl inverted_pendulum

Inverted Pendulum SAC

Hopper SAC - single worker

coach -p Mujoco_SAC -lvl hopper

Hopper SAC

Half Cheetah SAC - single worker

coach -p Mujoco_SAC -lvl half_cheetah

Half Cheetah SAC

Walker 2D SAC - single worker

coach -p Mujoco_SAC -lvl walker2d

Walker 2D SAC

Humanoid SAC - single worker

coach -p Mujoco_SAC -lvl humanoid

Humanoid SAC