toy101/DSAC

 
 

Discriminator Soft Actor Critic without Extrinsic Rewards

Paper

TODO

  • continuous action space
  • discrete action space

Install

  • pip install -r requirements.txt

Usage

  • Training Soft Q Imitation Learning (SQIL) and DSAC

    python train_sqil.py [options]

    • --load-demo [dirname] : directory containing the replay buffer of demonstrations
    • --absorb : wrap the environment with the absorbing state wrapper
    • --reward-func : use rewards generated by a reward function instead of constant rewards

    Example: DSAC with the absorbing state wrapper on AntBulletEnv-v0 (random seed = 1)

    • python train_sqil.py --env AntBulletEnv-v0 --load-demo demos/4_episode/AntBulletEnv-v0 --absorb --reward-func --seed 1
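
The options above can be illustrated with a minimal sketch. It is not taken from train_sqil.py: the names label_batch, with_absorbing_flag, and reward_fn are hypothetical, and the observation layout is an assumption. What it shows is the constant labelling used by SQIL (reward 1 for demonstration samples, reward 0 for the agent's own samples), a hook where a generated reward could replace those constants, and one common way to mark absorbing states by appending an indicator dimension to the observation.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Pair = Tuple[Sequence[float], Sequence[float]]    # (observation, action)
Labeled = Tuple[List[float], List[float], float]  # (observation, action, reward)
RewardFn = Callable[[Sequence[float], Sequence[float]], float]


def label_batch(pairs: Sequence[Pair], is_demo: bool,
                reward_fn: Optional[RewardFn] = None) -> List[Labeled]:
    """Attach a reward to every (observation, action) pair.

    Without reward_fn, the SQIL constants are used: 1.0 for
    demonstration samples and 0.0 for the agent's samples.
    reward_fn is a hypothetical hook for generated rewards; how the
    repository actually computes them is not shown here.
    """
    labeled = []
    for obs, act in pairs:
        if reward_fn is None:
            reward = 1.0 if is_demo else 0.0
        else:
            reward = float(reward_fn(obs, act))
        labeled.append((list(obs), list(act), reward))
    return labeled


def with_absorbing_flag(obs: Sequence[float], absorbing: bool = False) -> List[float]:
    """Append an indicator dimension to the observation (assumed layout).

    Regular states keep their values and get a trailing 0.0; the
    absorbing state is all zeros with a trailing 1.0.
    """
    if absorbing:
        return [0.0] * len(obs) + [1.0]
    return list(obs) + [0.0]


demo = [([0.1, 0.2], [1.0])]    # made-up demonstration sample
agent = [([0.3, 0.4], [-1.0])]  # made-up agent sample
batch = label_batch(demo, is_demo=True) + label_batch(agent, is_demo=False)
print(batch)
print(with_absorbing_flag([0.5, 0.6]), with_absorbing_flag([0.5, 0.6], absorbing=True))
```

In this sketch, passing a reward_fn corresponds loosely to --reward-func and calling with_absorbing_flag corresponds loosely to --absorb; the real script may implement both differently.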

Requirements

Python >= 3.7. See requirements.txt for the package dependencies.

To use a GPU, install CuPy with pip install cupy-cudaOO, where OO stands for your CUDA version. See the CuPy website for the package name that matches your CUDA installation.
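
After installing CuPy, a quick check like the following can confirm that a CUDA device is visible. This is a minimal sketch, not part of the repository: the helper name gpu_available is hypothetical, and it assumes only that cupy.cuda.runtime.getDeviceCount() is available in the installed CuPy version.

```python
def gpu_available() -> bool:
    """Return True if CuPy imports and reports at least one CUDA device."""
    try:
        import cupy  # fails if CuPy is missing or does not match the CUDA install
    except Exception:
        return False
    try:
        return cupy.cuda.runtime.getDeviceCount() > 0
    except Exception:
        # Raised, for example, when no driver or device is present.
        return False


print("GPU available:", gpu_available())
```

If this prints False on a machine with a GPU, the installed cupy-cudaOO package most likely does not match the local CUDA version.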
