
yuishihara/chainer-ppo


chainer-ppo

Reproduction code for Proximal Policy Optimization (PPO) with Chainer

About

This repo contains PPO reproduction code written with Chainer. See the original paper for details.
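For readers who have not seen the paper, the core of PPO is its clipped surrogate objective. The following is a minimal pure-Python sketch of that objective, not the code used in this repo; the function name `clipped_surrogate` and the default `epsilon=0.2` are illustrative assumptions.

```python
def clipped_surrogate(ratios, advantages, epsilon=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    ratios:     per-sample probability ratios pi_new(a|s) / pi_old(a|s)
    advantages: per-sample advantage estimates
    epsilon:    clip range (illustrative default, an assumption here)
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("at least one sample is required")

    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Keep the ratio inside [1 - epsilon, 1 + epsilon].
        clipped_ratio = min(max(ratio, 1.0 - epsilon), 1.0 + epsilon)
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)
```

With a positive advantage, a ratio above `1 + epsilon` earns no extra objective value, which is what discourages overly large policy updates.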

Training the network

Choose the parameters and run the command below. The default parameters are set for the Atari environment.

Example:

python3 main.py --env-type='atari' 

For details of the parameters, check the code or run:

python3 main.py --help
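As a rough orientation, the flags that appear in this README could be declared with `argparse` along the following lines. This is a hypothetical sketch, not the repo's actual `main.py`: the function name `build_parser`, all defaults, the `choices` lists, and the help strings are assumptions, and the real script may define more options.

```python
import argparse


def build_parser():
    """Hypothetical parser covering only the flags shown in this README."""
    parser = argparse.ArgumentParser(
        description="Sketch of a PPO training/testing command line")
    # Environment family; 'mujoco' is assumed from the Mujoco section.
    parser.add_argument("--env-type", choices=["atari", "mujoco"],
                        default="atari", help="environment family")
    # The default environment ID is an assumption.
    parser.add_argument("--env", default="BreakoutNoFrameskip-v4",
                        help="environment ID")
    parser.add_argument("--test-run", action="store_true",
                        help="run a trained model instead of training")
    parser.add_argument("--model-params", default="",
                        help="path to saved model parameters")
    parser.add_argument("--atari-model-size", choices=["small", "large"],
                        default="large", help="network size for Atari")
    return parser
```

For example, `build_parser().parse_args(["--env-type=atari", "--test-run"])` yields a namespace with `env_type == "atari"` and `test_run == True`.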

Results

Atari

Breakout

Small model (2 conv layers model)

python3 main.py --env-type='atari' --test-run --model-params=trained_results/atari/breakout/small/final_model --atari-model-size='small'
Result: breakout_small_result (image). Score: breakout_small_graph (image).

Large model (3 conv layers model)

python3 main.py --env-type='atari' --test-run --model-params=trained_results/atari/breakout/large/final_model --atari-model-size='large'
Result: breakout_large_result (image). Score: breakout_large_graph (image).
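To give a feel for how a 2 conv layer model and a 3 conv layer model differ in size, the sketch below computes the flattened feature count after the convolution stack. The layer settings and the 84x84 input are assumptions taken from commonly used Atari networks; they are not confirmed against this repo's model definitions.

```python
def conv_out(size, kernel, stride):
    """Output width of an unpadded convolution along one dimension."""
    return (size - kernel) // stride + 1


# (out_channels, kernel, stride) per layer; assumed values, not from the repo.
SMALL_LAYERS = [(16, 8, 4), (32, 4, 2)]
LARGE_LAYERS = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]


def flattened_features(layers, input_size=84):
    """Number of features after the conv stack for a square input."""
    size = input_size
    channels = 0
    for channels, kernel, stride in layers:
        size = conv_out(size, kernel, stride)
    return channels * size * size


if __name__ == "__main__":
    print("small:", flattened_features(SMALL_LAYERS))
    print("large:", flattened_features(LARGE_LAYERS))
```

Under these assumptions, the extra layer in the large model produces more feature maps at a smaller spatial size before the fully connected layers.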

Zaxxon

Large model (3 conv layers model)

python3 main.py --env-type='atari' --test-run --model-params=trained_results/atari/zaxxon/large/final_model --atari-model-size='large' --env='ZaxxonNoFrameskip-v4'
Result: zaxxon_large_result (image). Score: zaxxon_large_graph (image).

Space Invaders

Large model (3 conv layers model)

python3 main.py --env-type='atari' --test-run --model-params=trained_results/atari/space_invaders/large/final_model --atari-model-size='large' --env='SpaceInvadersNoFrameskip-v4'
Result: space_invaders_large_result (image). Score: space_invaders_large_graph (image).

Mujoco

Sorry, in progress...
