Reproduction of Twin Delayed Deep Deterministic policy gradient (TD3) with Chainer
This repository is a reproduction of TD3 written with Chainer. See the original paper for details.
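For readers new to TD3, the sketch below illustrates the three ideas the algorithm adds on top of DDPG: the clipped double-Q target, target policy smoothing, and delayed policy updates. It is a plain-Python illustration on scalar values, not this repository's Chainer implementation. The function names are hypothetical, and the default values (`gamma=0.99`, `noise_std=0.2`, `noise_clip=0.5`, `policy_delay=2`) follow the TD3 paper and are assumed here; the settings used in `main.py` may differ.

```python
import random


def td3_target(reward, done, next_q1, next_q2, gamma=0.99):
    """Clipped double-Q target: y = r + gamma * (1 - done) * min(Q1', Q2')."""
    return reward + gamma * (1.0 - done) * min(next_q1, next_q2)


def smoothed_target_action(action, noise_std=0.2, noise_clip=0.5,
                           low=-1.0, high=1.0, rng=random):
    """Target policy smoothing: add clipped Gaussian noise to the target action."""
    noise = rng.gauss(0.0, noise_std)
    noise = max(-noise_clip, min(noise_clip, noise))  # clip the noise itself
    return max(low, min(high, action + noise))        # keep action within bounds


def should_update_policy(iteration, policy_delay=2):
    """Delayed updates: the actor is updated once every `policy_delay` critic updates."""
    return iteration % policy_delay == 0
```

Taking the minimum of the two target critics is what counteracts the overestimation bias of a single critic; the noise on the target action smooths the value estimate over nearby actions.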
Training runs on the CPU by default.
$ python3 main.py --env="walker2d-v2"
On Linux, you may need to export the variable below before running the code.
$ export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so
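If you prefer not to export the variable in your shell, the same setting can be applied to a single child process. The following is a minimal sketch under that assumption: `build_env` and `run_training` are hypothetical helpers that are not part of this repository, and the library path is the one from the export command above, which may differ on your system.

```python
import os
import subprocess
import sys

# Path taken from the export command above; adjust it for your distribution.
GLEW_PATH = "/usr/lib/x86_64-linux-gnu/libGLEW.so"


def build_env(base_env=None, glew_path=GLEW_PATH):
    """Return a copy of the environment with LD_PRELOAD set if the library exists."""
    env = dict(os.environ if base_env is None else base_env)
    if os.path.exists(glew_path):
        existing = env.get("LD_PRELOAD", "")
        if glew_path not in existing.split(":"):
            # Prepend, keeping any libraries that were already preloaded.
            env["LD_PRELOAD"] = glew_path if not existing else glew_path + ":" + existing
    return env


def run_training(env_name="walker2d-v2"):
    """Hypothetical wrapper: launch main.py with the adjusted environment."""
    cmd = [sys.executable, "main.py", "--env=" + env_name]
    return subprocess.call(cmd, env=build_env())
```

Calling `run_training()` from the repository root would then start training without changing the environment of the parent shell. When the library file is not found, the environment is passed through unchanged.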
$ python3 main.py --test-run --pi-params=trained_results/mujoco/walker2d-v2/pi_final_model
result | score |
---|---|
$ python3 main.py --test-run --pi-params=trained_results/mujoco/ant-v2/pi_final_model
result | score |
---|---|