musyoku/deep-q-network
Introduction

This package provides a Chainer implementation of Deep Q-Network (DQN) as described in the following papers:

Playing Atari with Deep Reinforcement Learning (Mnih et al., 2013)
Human-level control through deep reinforcement learning (Mnih et al., 2015)

This is the code implemented in the accompanying article.
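As a reminder of what the implementation computes, the Q-learning target used by DQN can be sketched as follows. This is a minimal NumPy illustration of the update target from the papers, not the repository's actual Chainer code; the function name `dqn_target` and its signature are assumptions for this example.

```python
import numpy as np

def dqn_target(reward, next_q_values, terminal, gamma=0.99):
    """Q-learning target for a single transition:

    y = r                              if the episode terminated
    y = r + gamma * max_a' Q(s', a')   otherwise
    """
    if terminal:
        return reward
    return reward + gamma * float(np.max(next_q_values))

# For a non-terminal step with reward 1.0 and target-network outputs
# [0.5, 2.0], the target is 1.0 + 0.99 * 2.0 = 2.98.
```

The network is then trained to regress Q(s, a) toward this target for the action actually taken.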

Requirements

For environment setup, the article "DQN-chainerリポジトリを動かすだけ" is a useful reference.

Running

e.g. Atari Breakout

Open four terminal windows and run one of the following commands in each:

Terminal #1

rl_glue

Terminal #2

cd path_to_deep-q-network
python experiment.py --csv_dir breakout/csv --plot_dir breakout/plot

Terminal #3

cd path_to_deep-q-network/breakout
python train.py

Terminal #4

cd /home/your_name/ALE
./ale -game_controller rlglue -use_starting_actions true -random_seed time -display_screen true -frame_skip 4 -send_rgb true /path_to_rom/breakout.bin

Experiments

The specifications of the computer used for the experiments are as follows:

OS: Ubuntu 14.04 LTS
CPU: Core i7
RAM: 16GB
GPU: GTX 970M 6GB

Atari Breakout

Breakout

Preprocessing

We extract the luminance from the RGB frame and rescale it to 84x84.

Then we stack the 4 most recent frames to produce the input to the DQN.

e.g.

frame-0 frame-1 frame-2 frame-3
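The preprocessing steps above can be sketched as follows. This is a minimal NumPy sketch, not the repository's actual code: the luminance weights are the standard ITU-R BT.601 coefficients, the resize here is simple nearest-neighbor subsampling (the repository may use a library resizer), and the function names are assumptions.

```python
import numpy as np

def preprocess(rgb_frame):
    """Extract luminance from an RGB frame of shape (H, W, 3)
    and rescale it to 84x84 via nearest-neighbor subsampling."""
    luminance = (0.299 * rgb_frame[..., 0]
                 + 0.587 * rgb_frame[..., 1]
                 + 0.114 * rgb_frame[..., 2])
    h, w = luminance.shape
    rows = np.arange(84) * h // 84
    cols = np.arange(84) * w // 84
    return luminance[rows][:, cols].astype(np.float32)

def stack_frames(history):
    """Stack the 4 most recent preprocessed frames into a
    (4, 84, 84) array, the input to the DQN."""
    return np.stack(history[-4:], axis=0)
```

For a raw Atari frame of shape (210, 160, 3), `preprocess` returns an (84, 84) array, and stacking the last four of these yields the (4, 84, 84) network input.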

Training

We trained DQN for a total of 42 hours (8200 episodes, 93 epochs, 4670K frames).

We originally wanted to run 10,000 episodes, but the computer suddenly shut down, so the results ended up incomplete.

Score:

Breakout episode-score

Highscore:

Breakout episode-highscore

Evaluation

Average score:

Breakout episode-average

Atari Pong

Pong

Preprocessing

frame-0 frame-1 frame-2 frame-3

Training

We trained DQN for a total of 50 hours (1500 episodes, 96 epochs, 4849K frames).

Score:

Pong episode-score

Highscore:

Pong episode-highscore

Evaluation

Average score:

Pong episode-average
