mihahauke/dqn_vizdoom_theano

AI learning from raw visual input using the ViZDoom environment, implemented with Theano and Lasagne.

The code implements Double DQN with a dueling architecture:
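In the dueling head, the network estimates a state value V(s) and per-action advantages A(s, a); the standard way to combine them (as in the dueling DQN paper, and the formulation assumed here) is

Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')

Double DQN then picks the bootstrap action with the online network but evaluates it with the target network:

target = r + gamma * Q_target(s', argmax_a' Q_online(s', a'))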

Some videos with early results (recorded before double/dueling were added, and with some bugs): https://www.youtube.com/watch?v=re6hkcTWVUY

Requirements:

The code requires vizdoom.so and the vizdoom executable to be present in the root directory. Config files and scenarios are also needed (they can be found in the ViZDoom repository).
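For example, assuming ViZDoom was cloned and built in a sibling directory (the paths below are only illustrative; where the built binaries, configs and scenario files end up depends on your ViZDoom version and build setup):

cp ../ViZDoom/bin/vizdoom .
cp ../ViZDoom/bin/python/vizdoom.so .
cp -r ../ViZDoom/scenarios .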

Usage of the learning script:

usage: learn.py [-h] [--load-agent <AGENT_FILE>] [--list]
                [--load-json <JSON_FILE>] [--config-file <CONFIG_FILE>]
                [--name <NAME>] [--no-save] [--no-save-results]
                [--no-save-best] [--epochs <EPOCHS_NUM>]
                [--train-steps <TRAIN_STEPS>]
                [--test-episodes <TEST_EPISODES_NUM>] [--no-tqdm]
                [agent]

Learning script for ViZDoom.

positional arguments:
  agent                 agent function name from agents.py

optional arguments:
  -h, --help            show this help message and exit
  --load-agent <AGENT_FILE>, -l <AGENT_FILE>
                        load agent from a file
  --list                lists agents available in agents.py
  --load-json <JSON_FILE>, -j <JSON_FILE>
                        load agent's specification from a json file
  --config-file <CONFIG_FILE>, -c <CONFIG_FILE>
                        configuration file (used only when loading agent or
                        using json)
  --name <NAME>, -n <NAME>
                        agent's name (affects savefiles)
  --no-save             do not save agent's parameters
  --no-save-results     do not save agent's results
  --no-save-best        do not save the best agent
  --epochs <EPOCHS_NUM>, -e <EPOCHS_NUM>
                        number of epochs (default: infinity)
  --train-steps <TRAIN_STEPS>
                        training steps per epoch (default: 200k)
  --test-episodes <TEST_EPISODES_NUM>
                        testing episodes per epoch (default: 300)
  --no-tqdm             do not use tqdm progress bar
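
For example (the agent name below is only a placeholder; use --list to see the agent functions actually defined in agents.py):

python learn.py --list
python learn.py some_agent -n my_agent --epochs 20
python learn.py --load-agent <AGENT_FILE> -c <CONFIG_FILE>

The second command trains a fresh agent named my_agent for 20 epochs; the third resumes training from a previously saved agent with an explicit configuration file.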

Usage of the script for watching:

usage: watch.py [-h] [--config-file [config_file]] [--episodes [episodes]]
                [--no-watch] [--action-sleep [action_sleep]]
                [--episode-sleep [episode_sleep]]
                [agent_file]

A script to watch agents play or test them.

positional arguments:
  agent_file            file with the agent

optional arguments:
  -h, --help            show this help message and exit
  --config-file [config_file], -c [config_file]
                        override agent's configuration file
  --episodes [episodes], -e [episodes]
                        run this many episodes (default 20)
  --no-watch            do not display the window and do not sleep
  --action-sleep [action_sleep], -s [action_sleep]
                        sleep this many seconds after each action
                        (default=1/35.0)
  --episode-sleep [episode_sleep]
                        sleep this many seconds after each episode
                        (default=0.5)
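
For example (the agent file stands for whatever learn.py saved; its name depends on the agent's name):

python watch.py <AGENT_FILE> --episodes 5
python watch.py <AGENT_FILE> --no-watch -e 100

The first command displays 5 episodes at roughly playable speed; the second runs 100 test episodes without opening a window.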

Usage of the plotting script:

usage: plot_results.py [-h] [--stats <STAT> [<STAT> ...] | --list |
                       --x-resolution <X_RESOLUTION>]
                       files [files ...]

This script plots results generated by learn.py.

positional arguments:
  files                 file(s) with results

optional arguments:
  -h, --help            show this help message and exit
  --stats <STAT> [<STAT> ...], -s <STAT> [<STAT> ...]
                        plot given stats, e.g. mean, train_mean, std ...
  --list                list available stats for all files and exit
  --x-resolution <X_RESOLUTION>, -r <X_RESOLUTION>
                        interval for x axis in number of training actions
                        (default: 1000000)
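
For example (the results file stands for whatever learn.py wrote for your agent):

python plot_results.py <RESULTS_FILE> --list
python plot_results.py <RESULTS_FILE> --stats mean std

The first command lists the stats recorded in the file; the second plots the selected ones.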
