async_deep_reinforce

Asynchronous deep reinforcement learning

About

An attempt to reproduce Google DeepMind's paper "Asynchronous Methods for Deep Reinforcement Learning."

The Asynchronous Advantage Actor-Critic (A3C) method is currently implemented with TensorFlow, using Atari Pong as a test.

(However, the learning results are still not good. I am investigating the problem. Any advice or suggestions are very welcome.)
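For readers new to A3C, the quantity each worker computes after a short rollout is the n-step discounted return and the advantage derived from it. The following is a minimal pure-Python sketch of that calculation, not the code in this repository; the function names `n_step_returns` and `advantages` and the default `gamma` are assumptions made for illustration.

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Compute R_t = r_t + gamma * R_{t+1} for one rollout.

    bootstrap_value is the critic's estimate V(s) for the state after the
    last step (use 0.0 if the episode ended there).
    """
    returns = []
    running = bootstrap_value
    # Walk the rollout backwards so each step reuses the return after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage A_t = R_t - V(s_t), used to weight the policy gradient."""
    return [ret - val for ret, val in zip(returns, values)]


if __name__ == "__main__":
    rollout_rewards = [0.0, 0.0, 1.0]   # hypothetical rewards from 3 steps
    critic_values = [0.25, 0.25, 0.5]   # hypothetical critic outputs V(s_t)
    rets = n_step_returns(rollout_rewards, bootstrap_value=0.0, gamma=0.5)
    print(rets)
    print(advantages(rets, critic_values))
```

In the actor-critic setup, the returns serve as regression targets for the value output, and the advantages scale the log-probability gradient of the chosen actions.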

How to build

First, we need to build a multi-thread-ready version of the Arcade Learning Environment. I modified it so that it can run in a multi-threaded environment.

$ git clone https://github.com/miyosuda/Arcade-Learning-Environment.git
$ cd Arcade-Learning-Environment
$ cmake -DUSE_SDL=ON -DUSE_RLGLUE=OFF -DBUILD_EXAMPLES=ON .
$ make -j 4

$ pip install .

I recommend installing it in a virtualenv environment.
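The thread-safe build matters because A3C runs several actor-learner threads in one process, each stepping its own emulator instance while sharing a global step count and network parameters. The sketch below illustrates only that threading pattern with the standard library. It is not taken from `a3c.py`, and `SharedCounter`, `worker`, and `run_workers` are hypothetical names.

```python
import threading


class SharedCounter:
    """Global step counter shared by all worker threads."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self, amount):
        with self._lock:
            self._value += amount
            return self._value

    @property
    def value(self):
        with self._lock:
            return self._value


def worker(counter, max_steps, steps_per_update):
    # A real worker would create its own emulator instance here, which is
    # why the emulator library must tolerate several instances per process.
    while counter.value < max_steps:
        # Placeholder for: run a short rollout, compute gradients,
        # and apply them to the shared network.
        counter.increment(steps_per_update)


def run_workers(num_workers=4, max_steps=1000, steps_per_update=5):
    counter = SharedCounter()
    threads = [
        threading.Thread(target=worker,
                         args=(counter, max_steps, steps_per_update))
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


if __name__ == "__main__":
    print(run_workers())
```

Because each worker checks the counter before it increments, the final count can overshoot `max_steps` by up to `num_workers * steps_per_update`, which is harmless for a stopping condition.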

How to run

To train,

$ python a3c.py

To display the result with game play,

$ python a3c_display.py

