futurecrew/DeepRL

DeepRL

This project implements deep reinforcement learning algorithms from the following papers:

  • Deep Q Network (Human-level control through deep reinforcement learning)
  • Deep Reinforcement Learning with Double Q-learning
  • Asynchronous Methods for Deep Reinforcement Learning
  • Prioritized Experience Replay
  • Continuous control with deep reinforcement learning
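
As a rough illustration of how the first two papers differ, the sketch below computes the bootstrapped target for a single transition in both styles. It is not taken from this repository's code; the function names, the `gamma` default, and the plain-list Q-values are assumptions made for the example.

```python
def dqn_target(reward, next_q_target, gamma=0.99, terminal=False):
    """DQN target: bootstrap from the largest target-network Q-value."""
    if terminal:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target,
                      gamma=0.99, terminal=False):
    """Double DQN target: the online network selects the action,
    the target network evaluates it."""
    if terminal:
        return reward
    # Index of the action the online network currently prefers.
    best_action = max(range(len(next_q_online)),
                      key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]
```

When the two networks disagree about the best next action, the Double DQN target is never larger than the DQN target, which is the overestimation correction the second paper describes.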

Test scores

On my PC (i7 CPU, Titan X Maxwell):

- A3C FF took 20 hours for 80M global steps (NIPS network)
- A3C LSTM took 44 hours for 80M global steps (NIPS network)

- DQN took 96 hours for 80M steps (11M steps shown, Nature network)
- Double-Q took 112 hours for 80M steps (11M steps shown, Nature network)
- Prioritized took 112 hours for 80M steps (11M steps shown, Nature network)
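
To compare these timings on one scale, the short script below converts each of them into average steps per second. The hours and step counts are copied from the list above; the helper name `steps_per_second` is made up for this example, and the A3C figures count global steps across all threads, so this is only a rough comparison.

```python
# (algorithm, wall-clock hours for 80M steps), copied from the list above.
RUNS = [
    ("A3C FF", 20),
    ("A3C LSTM", 44),
    ("DQN", 96),
    ("Double-Q", 112),
    ("Prioritized", 112),
]
TOTAL_STEPS = 80e6


def steps_per_second(total_steps, hours):
    """Average throughput over the whole run."""
    return total_steps / (hours * 3600.0)


for name, hours in RUNS:
    rate = steps_per_second(TOTAL_STEPS, hours)
    print("{:<12} {:7.1f} steps/s".format(name, rate))
```

On these numbers, A3C FF averages roughly 1,100 steps per second and DQN roughly 230, a ratio of 96 / 20 = 4.8.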

Torcs

After training in the Torcs simulator, the agent learns to accelerate, brake, and turn the steering wheel.

Requirements

  • Python-2.7
  • pip, scipy, matplotlib, numpy
  • Tensorflow-0.11
  • Arcade-Learning-Environment
  • Torcs (optional)
  • Vizdoom (work in progress)

    See this for installation.

How to train

DQN         : python train.py /path/to/rom --drl dqn
Double DQN  : python train.py /path/to/rom --drl double_dqn
Prioritized : python train.py /path/to/rom --drl prioritized_rank
A3C FF      : python train.py /path/to/rom --drl a3c --thread-no 8
A3C LSTM    : python train.py /path/to/rom --drl a3c_lstm --thread-no 8
DDPG        : python train.py torcs --ddpg
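
If you launch several runs from a script, the commands above can be assembled programmatically. The following is a minimal sketch that uses only the `--drl` and `--thread-no` options shown above; the helper names `build_train_command` and `run_training` are hypothetical and not part of this project. DDPG is left out because its command takes `torcs` instead of a ROM path.

```python
import subprocess

# --drl values copied from the commands listed above.
DRL_NAMES = ("dqn", "double_dqn", "prioritized_rank", "a3c", "a3c_lstm")
A3C_NAMES = ("a3c", "a3c_lstm")


def build_train_command(drl, rom_path, thread_no=8):
    """Return the argv list for one of the ROM-based training commands."""
    if drl not in DRL_NAMES:
        raise ValueError("unknown --drl value: {!r}".format(drl))
    command = ["python", "train.py", rom_path, "--drl", drl]
    if drl in A3C_NAMES:
        # Only the A3C commands above pass --thread-no.
        command += ["--thread-no", str(thread_no)]
    return command


def run_training(drl, rom_path, thread_no=8):
    """Run train.py in the current directory and return its exit code."""
    return subprocess.call(build_train_command(drl, rom_path, thread_no))
```

For example, `build_train_command("a3c", "/rom/breakout.bin")` yields the same arguments as the A3C FF line above.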

How to retrain

python train.py /path/to/rom --drl a3c --thread-no 8 --snapshot path/to/snapshot_file
Example: python train.py /rom/breakout.bin --drl a3c --thread-no 8 --snapshot snapshot/breakout/20161114_003838/a3c_6250000

How to play

python play.py path/to/snapshot_file
Example: python play.py snapshot/space_invaders/20161114_003838/a3c_79993828
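
The snapshot paths in the examples above appear to follow the pattern `snapshot/<game>/<timestamp>/<algorithm>_<step>`. That layout is inferred only from those two examples, not confirmed by the code, and reading the trailing number as a step count is an assumption. Under that assumption, a small hypothetical helper could split a path into its parts:

```python
import collections

Snapshot = collections.namedtuple("Snapshot", "game timestamp algorithm step")


def parse_snapshot_path(path):
    """Split '<...>/<game>/<timestamp>/<algorithm>_<step>' into fields.

    The layout is an assumption based on the example paths only.
    """
    parts = path.strip("/").split("/")
    if len(parts) < 3:
        raise ValueError("expected at least <game>/<timestamp>/<file>")
    game, timestamp, filename = parts[-3:]
    # Split on the last underscore so names like 'a3c_lstm' stay intact.
    algorithm, _, step = filename.rpartition("_")
    if not algorithm or not step.isdigit():
        raise ValueError("expected a file name like '<algorithm>_<step>'")
    return Snapshot(game, timestamp, algorithm, int(step))
```

This can be handy for picking the latest snapshot of a run by comparing the `step` field.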

Debug console commands

While training, you can type the following debug commands in the console.

  • p : toggle printing of debug logs
  • u : pause or resume training
  • quit : stop the program
  • d : toggle display of the game screen, so you can watch how training is progressing
  • - : play back the screen faster
  • + : play back the screen more slowly
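
One way such a console could be wired up is sketched below: each command flips or adjusts a field on a shared state object that the training loop checks. This is an illustrative sketch, not this project's implementation; `DebugState`, `handle_command`, and the `frame_delay` values are all made up for the example.

```python
class DebugState(object):
    """Flags a training loop could poll between steps (hypothetical)."""

    def __init__(self):
        self.print_logs = False
        self.paused = False
        self.show_screen = False
        self.frame_delay = 0.04  # seconds between displayed frames (assumed)
        self.running = True


def handle_command(state, command):
    """Apply one console command; return False if it is not recognized."""
    command = command.strip()
    if command == "p":
        state.print_logs = not state.print_logs
    elif command == "u":
        state.paused = not state.paused
    elif command == "d":
        state.show_screen = not state.show_screen
    elif command == "-":
        state.frame_delay = state.frame_delay / 2.0  # faster playback
    elif command == "+":
        state.frame_delay = state.frame_delay * 2.0  # slower playback
    elif command == "quit":
        state.running = False
    else:
        return False
    return True
```

In a real program, a background thread would typically read lines from `sys.stdin` and pass each one to `handle_command`, while the training loop checks `state.paused` and `state.running`.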
