Deep Q-Learning

Overview

Our implementation of the deep Q-learning algorithm from the DQN paper. The agent reads the screen and the integer score of the Atari 2600 game Space Invaders, and outputs the same control commands a human player would send with a controller (without the physical controller).
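To make the data flow concrete, here is a minimal sketch of two pieces at the core of deep Q-learning: epsilon-greedy action selection and the one-step Q-learning target. It is plain Python with a hand-written list standing in for the network's Q-value output. The names `select_action` and `td_target`, the six-action list, and the values of `epsilon` and `gamma` are illustrative assumptions, not taken from dqn.py.

```python
import random


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: a random action with probability epsilon, else the argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, terminal, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if terminal:
        # No future reward after the episode ends.
        return reward
    return reward + gamma * max(next_q_values)


# Hypothetical Q-values for six controller actions (illustrative numbers only).
q = [0.1, 0.5, 0.2, 0.0, 0.4, 0.3]
action = select_action(q, epsilon=0.0)  # epsilon=0 means a purely greedy choice
target = td_target(reward=1.0, next_q_values=q, terminal=False, gamma=0.9)
```

In a full agent the reward would be derived from the change in game score, and `q` would come from a forward pass of the network on the current screen.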

Installation Dependencies:

Python 2.7
Theano
Lasagne
pygame
Arcade Learning Environment (ALE) 0.5.1
Atari 2600 ROM of space_invaders.bin

Amazon Instance Installation

See /provision/aws_installation.sh for a concise shell history of the commands used to install the environment.

External References

The DQN paper

Human-level control through deep reinforcement learning

Deep Reinforcement Learning with Double Q-learning - more stable learning through double q-learning

Action-Conditional Video Prediction using Deep Networks in Atari Games - predicting future frames

Dueling Network Architectures for Deep Q-learning

Arcade Learning Environment

Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation

Recurrent Models of Visual Attention - applying q-learning to figure out what part of the image to look at.

Prioritized Experience Replay - a memory should be more likely to be drawn from the replay memory the more surprising it is

Deep Recurrent Q-Learning For Partially Observable MDPs - using an LSTM removes the need for the preprocessing done in the DQN paper. "The recurrent net can better adapt at evaluation time if the quality of observations changes"

A fast learning algorithm for deep belief nets - Training one layer at a time

Reinforcement Learning and Automated Planning: A Survey

Autoregressive Neural Networks - Neural Networks applied to Time Series.

Deep Autoregressive Neural Networks - predicting future frames of an Atari Game.

Reinforcement Learning: An Introduction - very thorough introduction to Reinforcement Learning.

A survey of robot learning by demonstration - Learning by|from demonstration = Learning by watching = Learning from observation = Programming by demonstration = Behaviour cloning|imitation|mimicry

DynaQ

Deep Reinforcement Learning - nice summary of recent advances in Deep Q-learning.

Concurrent Q-learning for Autonomous Mapping and Navigation - one-trial learning???

Using Reinforcement Learning to Adapt an Imitation Task - overcoming new obstacles???

On the importance of initialization and momentum in deep learning - Nesterov Momentum vs Nesterov Accelerated Gradient

CNN Features off-the-shelf: an Astounding Baseline for Recognition - NN-generated features are better than manually made ones

Prioritized Experience Replay - on Atari games

Network in Network - max pooling loses information, so let's keep some more of it.

Concurrent Reinforcement Learning - RL in time dependent environments
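Several of the references above modify how the replay memory is used. As one illustration, the idea noted for Prioritized Experience Replay (surprising memories are drawn more often) can be sketched as sampling in proportion to the absolute TD error. This is a simplified sketch and not code from this repository: the function names, the exponent `alpha`, and the small constant `eps` are illustrative assumptions, and the efficient sum-tree and importance-sampling correction described in the paper are left out.

```python
import random


def sampling_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Turn absolute TD errors into sampling probabilities.

    eps keeps zero-error transitions sampleable; alpha=0 gives uniform sampling.
    """
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_index(probs, rng=random):
    """Draw one index according to probs using the cumulative distribution."""
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1  # guard against floating-point round-off


# Hypothetical TD errors for four stored transitions (illustrative numbers only).
td_errors = [0.1, 2.0, 0.5, 0.0]
probs = sampling_probabilities(td_errors)
rng = random.Random(0)
batch = [sample_index(probs, rng) for _ in range(3)]
```

The transition with the largest error (index 1 here) gets the highest probability, while the zero-error transition can still be drawn occasionally.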

wh-forker/deep-q-learning
