chainer-td3

Reproduction code for Twin Delayed Deep Deterministic policy gradient (TD3), written with Chainer

About

This repo is a reproduction of TD3 written with Chainer. See the original paper for details.
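For readers new to TD3, here is a minimal sketch of the critic target and the delayed actor update described in the original paper. It is not taken from td3.py: the function names, the scalar-action simplification, and the callable critics are assumptions made for illustration. The default hyperparameters are the values reported in the paper, and this repo may use different ones.

```python
import random


def td3_target(reward, done, next_action, q1_target, q2_target,
               gamma=0.99, sigma=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=random):
    """Critic target for one transition with a scalar action (illustrative only).

    next_action is the target policy's output for the next state.
    q1_target and q2_target are hypothetical callables that map an action
    to the target critics' Q-values for that next state.
    """
    # Target policy smoothing: perturb the target action with clipped Gaussian noise.
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, sigma)))
    smoothed = max(action_low, min(action_high, next_action + noise))
    # Clipped double Q-learning: use the smaller of the two target critics.
    q_min = min(q1_target(smoothed), q2_target(smoothed))
    # Do not bootstrap past a terminal state.
    return reward + gamma * (1.0 - float(done)) * q_min


def should_update_actor(iteration, policy_delay=2):
    """Delayed updates: refresh the actor and target networks every policy_delay critic steps."""
    return iteration % policy_delay == 0
```

Taking the minimum of two critics counters Q-value overestimation, and updating the actor less often than the critics lets the value estimate settle before the policy follows it.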

How to train

Training runs on the CPU by default.

$ python3 main.py --env="walker2d-v2"

Results

On Linux, you may need to export the variable below before running the code.

$ export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so

Walker2d-v2

$ python3 main.py --test-run --pi-params=trained_results/mujoco/walker2d-v2/pi_final_model

result	score

Ant-v2

$ python3 main.py --test-run --pi-params=trained_results/mujoco/ant-v2/pi_final_model

result	score
