
# AgentNet

A lightweight library to build and train neural networks for reinforcement learning using Theano+Lasagne

## Warning

The library is halfway through development. We maintain a set of runnable examples, but some parts are still missing and others may change significantly with new versions.

## Ubuntu Installation

So far, installation has only been tested on Ubuntu, but an experienced user is unlikely to have problems installing it on another Linux distribution or a Mac OS machine. The minimal dependencies are currently bleeding-edge Theano and Lasagne.

```bash
sudo apt-get install python-dev python-pip python-nose g++ gfortran liblapack-dev libopenblas-dev git
sudo pip install virtualenv
virtualenv agentnet_env
source agentnet_env/bin/activate
pip install --upgrade https://github.com/Theano/Theano/archive/master.zip
pip install --upgrade https://github.com/Lasagne/Lasagne/archive/master.zip
git clone https://github.com/justheuristic/AgentNet
cd AgentNet
python setup.py develop
```

If you wish to get acquainted with the current state of the library, take a look at the [tutorial notebook](https://github.com/BladeCarrier/AgentNet/blob/master/examples/Agentnet%20tutorial%20-%20boolean%20reasoning%20problem.ipynb) in `./examples`.

If you wish to join the development, we would be eager to accept your help. Current priority development anchors are maintained at the bottom of this readme.

If you wish to contribute your own architecture or experiment, please contact me via GitHub or justheuristic@gmail.com. In fact, please contact me if you have any questions or ideas; I'd be eager to hear them.

## What

The final framework is planned to be built on top of, and fully compatible with, the awesome Lasagne [6], with some helper functions to facilitate learning.

The main objectives are:

  • simple way of tinkering with reinforcement learning architectures
  • simple experiment design and reproducibility
  • seamless compatibility with Lasagne and Theano

## Why?

[long story short: create a platform to play with *QN architectures without spending months reading code]

The last several years have marked the rediscovery of neural networks applied to the Reinforcement Learning domain. The idea was first introduced in the early '90s [0], or even earlier, but was largely forgotten soon afterwards.

Years later, these methods were reborn under Deep Learning sauce and popularized by Deepmind [1,2]. Several other researchers have already jumped into the domain with their architectures [3,4] and even dedicated playgrounds [5] to play with them.

The problem is that all these models exist in their own problem-setup and implementation bubbles. Simply comparing your new architecture against the ones you know requires roughly:

  • 10% implementing architecture
  • 20% implementing experiment setup
  • 70% reimplementing all the other network architectures

This process is not only inefficient but also error-prone, since a single mistake while reimplementing an 'other' architecture can lead to incorrect results.

So here we are, attempting to build yet another bridge between eager researchers [primarily ourselves so far] and deep reinforcement learning.

The key objective is to make it easy to build new architectures and test them against others on a number of problems. The easier it is to reproduce an experiment setup, and the simpler it is to architect something new and wonderful, the quicker we get to solutions directly applicable to real-world problems.

## Current state

The library is currently halfway through development, and there is much to be done yet.

Items below are marked as `[priority] Component`:

  • Core components

  • [done] Environment

  • [done] Objective

  • [done] Agent architecture

  • Experiment platform

    • [global] Experiment setup zoo
    • [global] Pre-trained model zoo
    • [medium] one-line experiment running
  • Layers

  • Memory

    • Simple RNN done as Lasagne.layers.DenseLayer
    • [done] One-step GRU memory
    • [medium] LSTM
    • [medium] Custom memory layer
  • Resolvers

    • [done] Greedy resolver (as BaseResolver)
    • [done] Epsilon-greedy resolver
    • [low] Softmax resolver
  • Q-evaluator

    • Supports any lasagne architecture
    • [medium] evaluator with learned baseline
  • Learning objectives algorithms

    • [done] Q-learning
    • Can use any theano/lasagne expressions for loss, gradients and updates
    • [high] Training on interesting sessions pool
    • [done] SARSA
    • [done] k-step learning
    • [done] Actor-critic methods
    • [low] policy gradient training
  • Experiment setups

    • [done] boolean reasoning - basic "tutorial" experiment about learning to exploit variable dependencies
    • [done] Wikicat - guessing person's traits based on wikipedia biographies
    • [high] KSfinder - detecting particle decays in Large Hadron Collider beauty experiment
  • Visualization tools

    • [done] basic monitoring tools
    • [medium] generic tunable session visualizer
  • Explanatory material

  • [medium] readthedocs pages

  • [global] MOAR sensible examples
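To give a taste of what the learning objectives and resolvers above compute, here is a minimal, framework-free sketch of a tabular Q-learning update paired with an epsilon-greedy resolver. This is plain Python for illustration only; it is not the AgentNet or Lasagne API (AgentNet expresses these as symbolic Theano graphs over batches of sessions):

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """Resolver: pick a random action with probability epsilon, else the greedy one."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def q_learning_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.99):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])

# Toy usage: two states, two actions, all Q-values initialized to zero.
Q = {0: [0.0, 0.0], 1: [0.0, 0.0]}
q_learning_update(Q, s=0, a=1, r=1.0, s_next=1)
print(Q[0][1])  # 0.5: alpha * reward, since Q(s') is still all zeros
```

Replacing the `max` over next-state Q-values with the Q-value of the action actually taken turns this into SARSA; AgentNet's objectives apply the same recurrences over whole recorded sessions.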
