AgentNet

A lightweight library to build and train neural networks for reinforcement learning using Theano+Lasagne

Warning

The library is halfway through development. We maintain a set of runnable examples, but some parts are still missing and others may change significantly with new versions.

Ubuntu Installation

So far the installation has only been tested on Ubuntu, yet an experienced user is unlikely to have problems installing it on another Linux or Mac OS machine. Currently the minimal dependencies are bleeding-edge Theano and Lasagne.

    sudo apt-get install python-dev python-pip python-nose g++ gfortran liblapack-dev libopenblas-dev git
    sudo pip install virtualenv
    virtualenv agentnet_env
    source agentnet_env/bin/activate
    pip install --upgrade https://github.com/Theano/Theano/archive/master.zip
    pip install --upgrade https://github.com/Lasagne/Lasagne/archive/master.zip
    git clone https://github.com/justheuristic/AgentNet
    cd AgentNet
    python setup.py develop
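
After the commands above finish, a quick import check can confirm that the virtualenv sees all three packages. The snippet below is only a sketch: the helper name `find_missing` is made up for illustration, and the module names `theano`, `lasagne` and `agentnet` are assumed from the install commands and the repository folder name.

```python
import importlib

# Module names assumed from the install steps; "agentnet" is assumed to be
# the package name registered by `python setup.py develop`.
REQUIRED_MODULES = ("theano", "lasagne", "agentnet")


def find_missing(modules=REQUIRED_MODULES):
    """Return the names from `modules` that fail to import (hypothetical helper)."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Run it with the `agentnet_env` virtualenv activated; any module it reports as missing points at the install step to repeat.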

#####If you wish to get acquainted with the current library state, view some of the ./examples https://github.com/BladeCarrier/AgentNet/blob/master/examples/Agentnet%20tutorial%20-%20boolean%20reasoning%20problem.ipynb

If you wish to join the development, we would be eager to accept your help. Current priority development anchors are maintained at the bottom of this readme.

If you wish to contribute your own architecture or experiment, please contact me via github or justheuristic@gmail.com. In fact, please contact me if you have any questions or ideas; I'd be eager to see them.

##What

The final framework is planned to be built on and fully compatible with awesome Lasagne[6] with some helper functions to facilitate learning.

The main objectives are:

- simple way of tinkering with reinforcement learning architectures
- simple experiment design and reproducibility
- seamless compatibility with Lasagne and Theano

##Why?

[long story short: create a platform to play with *QN architectures without spending months reading code]

The last several years have marked the rediscovery of neural networks applied to the Reinforcement Learning domain. The idea was first introduced in the early 90's [0] or even earlier, but was mostly forgotten soon afterwards.

Years later, these methods were reborn under Deep Learning sauce and popularized by DeepMind [1,2]. Several other researchers have already jumped into the domain with their architectures [3,4] and even dedicated playgrounds [5] to play with them.

The problem is that all these models exist in their own problem setup and implementation bubbles. Simply comparing your new architecture to the ones you know requires:

- 10% implementing architecture
- 20% implementing experiment setup
- 70% reimplementing all the other network architectures

This process is not only inefficient, but also very error-prone, since a single mistake while implementing one of the 'other' architectures can lead to incorrect results.

So here we are, attempting to build yet another bridge between eager researchers [primarily ourselves so far] and deep reinforcement learning.

The key objective is to make it easy to build new architectures and test them against others on a number of problems. The easier it is to reproduce the experiment setup, the simpler it is to architect something new and wonderful, and the quicker we get to solutions directly applicable to real world problems.

[0] a dusty old journal issue - https://books.google.ru/books?id=teHhVHk3a54C&printsec=frontcover#v=onepage&q&f=false
[1] DQN by DeepMind - http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html
[2] DQN explained - https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf
[3] attentive DRQN - http://arxiv.org/pdf/1512.01693.pdf
[4] deep recurrent - http://arxiv.org/abs/1507.06527
[5] MazeBase by Facebook - http://arxiv.org/pdf/1511.07401.pdf

##Current state

The library is currently halfway through creation and there is much to be done yet.

[priority] Component

- Core components
  - [done] Environment
  - [done] Objective
  - [done] Agent architecture
- Experiment platform
  - [global] Experiment setup zoo
  - [global] Pre-trained model zoo
  - [medium] one-row experiment running
- Layers
  - Memory
    - Simple RNN done as Lasagne.layers.DenseLayer
    - [done] One-step GRU memory
    - [medium] LSTM
    - [medium] Custom memory layer
  - Resolvers
    - [done] Greedy resolver (as BaseResolver)
    - [done] Epsilon-greedy resolver
    - [low] Softmax resolver
  - Q-evaluator
    - Supports any lasagne architecture
    - [medium] evaluator with learned baseline
- Learning objectives algorithms
  - [done] Q-learning
  - Can use any theano/lasagne expressions for loss, gradients and updates
  - [high] Training on interesting sessions pool
  - [done] SARSA
  - [done] k-step learning
  - [done] Actor-critic methods
  - [low] policy gradient training
- Experiment setups
  - [done] boolean reasoning - basic "tutorial" experiment about learning to exploit variable dependencies
  - [done] Wikicat - guessing person's traits based on wikipedia biographies
  - [high] KSfinder - detecting particle decays in Large Hadron Collider beauty experiment
- Visualization tools
  - [done] basic monitoring tools
  - [medium] generic tunable session visualizer
- Explanatory material
  - [medium] readthedocs pages
  - [global] MOAR sensible examples
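
For readers new to the terms in the list above, here is a rough plain-Python sketch of what the greedy and epsilon-greedy resolvers and the Q-learning, SARSA and k-step targets compute. It is not AgentNet's API: the library expresses these as Theano/Lasagne expressions, and every function name and parameter below (`greedy_action`, `epsilon_greedy_action`, `q_learning_target`, `sarsa_target`, `k_step_target`, `gamma`, `epsilon`) is made up for illustration.

```python
import random


def greedy_action(q_values):
    """Index of the largest Q-value (ties go to the lowest index)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action, else act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy_action(q_values)


def q_learning_target(reward, next_q_values, gamma=0.99, is_terminal=False):
    """One-step Q-learning target: r + gamma * max over a' of Q(s', a')."""
    if is_terminal:
        return reward
    return reward + gamma * max(next_q_values)


def sarsa_target(reward, next_q_values, next_action, gamma=0.99, is_terminal=False):
    """One-step SARSA target: r + gamma * Q(s', a') for the action actually taken next."""
    if is_terminal:
        return reward
    return reward + gamma * next_q_values[next_action]


def k_step_target(rewards, bootstrap_value, gamma=0.99):
    """k-step target: discounted sum of k rewards plus gamma^k * bootstrap_value."""
    target = bootstrap_value
    # Fold from the last reward backwards so each step applies one discount.
    for reward in reversed(rewards):
        target = reward + gamma * target
    return target
```

The only difference between the two one-step targets is the bootstrap term: Q-learning uses the best next action, SARSA uses the action the agent actually took. With a single reward and the maximal next Q-value as the bootstrap, `k_step_target` reduces to `q_learning_target`.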
