Reproduction codes of Proximal Policy Optimization (PPO) with chainer
This repo is a PPO reproduction codes writen with chainer. See this original paper for details
Choose the params and run below command. The default parameters are set for running in atari environment.
Example:
python3 main.py --env-type='atari'
For the detail of the parameters check the code or type
python3 main.py --help
$ python3 main.py --env-type='atari' --test-run --model-params=trained_results/atari/breakout/small/final_model --atari-model-size='small'
result | score |
---|---|
python3 main.py --env-type='atari' --test-run --model-params=trained_results/atari/breakout/large/final_model --atari-model-size='large'
result | score |
---|---|
python3 main.py --env-type='atari' --test-run --model-params=trained_results/atari/zaxxon/large/final_model --atari-model-size='large' --env='ZaxxonNoFrameskip-v4'
result | score |
---|---|
python3 main.py --env-type='atari' --test-run --model-params=trained_results/atari/space_invaders/large/final_model --atari-model-size='large' --env='SpaceInvadersNoFrameskip-v4'
result | score |
---|---|
Sorry in progress...