DeepRL

This project implements deep reinforcement learning algorithms from the following papers.

Deep Q Network (Human-level control through deep reinforcement learning)
Deep Reinforcement Learning with Double Q-learning
Asynchronous Methods for Deep Reinforcement Learning
Prioritized Experience Replay
Continuous control with deep reinforcement learning

Test scores

On my PC (i7 CPU, Titan-X Maxwell):

- A3C FF took 20 hours for 80M global steps (NIPS network)
- A3C LSTM took 44 hours for 80M global steps (NIPS network)

- DQN took 96 hours for 80M steps (11M steps shown, Nature network)
- Double-Q took 112 hours for 80M steps (11M steps shown, Nature network)
- Prioritized took 112 hours for 80M steps (11M steps shown, Nature network)
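
To compare the runs more directly, the wall-clock figures above can be converted into average throughput. The snippet below is only a rough sketch of that arithmetic: the dictionary and function names are made up for illustration, and the A3C numbers count global steps summed over all threads, so they are not per-thread rates.

```python
# Wall-clock hours for 80M steps, copied from the list above.
HOURS_FOR_80M_STEPS = {
    "A3C FF": 20,
    "A3C LSTM": 44,
    "DQN": 96,
    "Double-Q": 112,
    "Prioritized": 112,
}
TOTAL_STEPS = 80 * 10**6


def steps_per_second(hours, total_steps=TOTAL_STEPS):
    """Average training throughput in steps per second."""
    return total_steps / (hours * 3600.0)


for name, hours in sorted(HOURS_FOR_80M_STEPS.items()):
    print("{:<12} {:7.1f} steps/s".format(name, steps_per_second(hours)))
```

By this measure A3C FF averages roughly 1,100 steps per second, while the replay-based methods average roughly 200 to 230.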

Torcs

After training in the Torcs simulator, the agent learns how to accelerate, brake, and turn the steering wheel.
Click the image to watch the video.

Requirements

Python-2.7
pip, scipy, matplotlib, numpy
Tensorflow-0.11
Arcade-Learning-Environment
Torcs (optional)
Vizdoom (work in progress)

See INSTALL.md for installation.

How to train

DQN         : python train.py /path/to/rom --drl dqn
Double DQN  : python train.py /path/to/rom --drl double_dqn
Prioritized : python train.py /path/to/rom --drl prioritized_rank
A3C FF      : python train.py /path/to/rom --drl a3c --thread-no 8
A3C LSTM    : python train.py /path/to/rom --drl a3c_lstm --thread-no 8
DDPG        : python train.py torcs --ddpg
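
When launching several runs from a script, it may help to build these command lines programmatically. The helper below is a hypothetical sketch, not part of this project: `build_train_command` is an invented name, and it only uses the `--drl`, `--thread-no`, and `--snapshot` flags that appear in this README. It does not cover the DDPG command, which uses a different form.

```python
def build_train_command(rom_path, drl, thread_no=None, snapshot=None):
    """Return the argument list for a train.py run (hypothetical helper)."""
    command = ["python", "train.py", rom_path, "--drl", drl]
    if thread_no is not None:
        # Thread count, as used for the A3C variants in the examples above.
        command += ["--thread-no", str(thread_no)]
    if snapshot is not None:
        # Resume from an existing snapshot file.
        command += ["--snapshot", snapshot]
    return command


command = build_train_command("/path/to/rom", "a3c", thread_no=8)
print(" ".join(command))
# prints: python train.py /path/to/rom --drl a3c --thread-no 8
```

The returned list can be passed to `subprocess.call(command)` to start training from the project directory.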

How to retrain

python train.py /path/to/rom --drl a3c --thread-no 8 --snapshot path/to/snapshot_file
Example: python train.py /rom/breakout.bin --drl a3c --thread-no 8 --snapshot snapshot/breakout/20161114_003838/a3c_6250000

How to play

python play.py path/to/snapshot_file
Example: python play.py snapshot/space_invaders/20161114_003838/a3c_79993828
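
The two example paths in this README appear to follow the pattern `snapshot/<game>/<run>/<algorithm>_<step>`. Assuming that layout holds in general (it is inferred from the examples only, not confirmed by the project), a snapshot path could be split into its parts as sketched below. The function name and the dictionary keys are hypothetical.

```python
def parse_snapshot_path(path):
    """Split a snapshot path into game, run, algorithm, and step.

    Assumes the layout .../<game>/<run>/<algorithm>_<step>, inferred
    from the example paths in this README.
    """
    game, run_id, file_name = path.split("/")[-3:]
    # Split on the last underscore so names such as "a3c_lstm" stay intact.
    algorithm, step = file_name.rsplit("_", 1)
    return {"game": game, "run": run_id, "algorithm": algorithm, "step": int(step)}


info = parse_snapshot_path("snapshot/space_invaders/20161114_003838/a3c_79993828")
print(info["game"], info["algorithm"], info["step"])
```

This can be handy for sorting the snapshots of one run by step count before choosing which one to pass to play.py.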

Debug console commands

While training, you can enter the following debug commands in the console.

p : toggle printing of debug logs
u : pause or resume training
quit : stop the run
d : toggle display of the current game screen, so you can watch how training is progressing
- : speed up the screen display
+ : slow down the screen display
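
For readers curious how such toggle commands might be wired up, here is a minimal sketch. It is not the project's actual implementation: the class name, attribute names, initial values, and the halving and doubling of the frame delay are all assumptions made for illustration.

```python
class DebugConsole(object):
    """Hypothetical holder for the state changed by the console commands."""

    def __init__(self):
        self.print_logs = False   # toggled by "p"
        self.paused = False       # toggled by "u"
        self.show_screen = False  # toggled by "d"
        self.running = True       # cleared by "quit"
        self.frame_delay = 0.05   # seconds between displayed frames (assumed)

    def handle(self, command):
        """Apply one command; return False if it is not recognized."""
        command = command.strip()
        if command == "p":
            self.print_logs = not self.print_logs
        elif command == "u":
            self.paused = not self.paused
        elif command == "d":
            self.show_screen = not self.show_screen
        elif command == "quit":
            self.running = False
        elif command == "-":
            # Faster display: shorter delay, with a small lower bound.
            self.frame_delay = max(0.001, self.frame_delay / 2.0)
        elif command == "+":
            # Slower display: longer delay.
            self.frame_delay *= 2.0
        else:
            return False
        return True


console = DebugConsole()
console.handle("d")
console.handle("-")
print(console.show_screen, console.frame_delay)
```

In a real training script, a background thread could read lines from standard input and pass each one to `handle`, while the training loop checks `paused` and `running` between steps.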
