tinyverse

Universe RL trainer platform. Simple. Supple. Scalable.

Why should I care?

tinyverse is a reinforcement learning platform for gym/universe/custom environments that lets you use whatever resources you have to train a reinforcement learning algorithm.

Key features

  • Simple: the core is currently under 400 lines, including code (~50%), comments (~40%), and whitespace (~10%).
  • Supple: tinyverse assumes almost nothing about your agent and environment. The environment does not have to be interruptible, and the agent may use any algorithm or structure. The agent [will soon](yandexdataschool#14) support any framework, from numpy to pure tensorflow/theano to keras/lasagne+agentnet.
  • Scalable: you can train and play 10 parallel games on your GPU desktop/server, 20 more sessions on your MacBook, and another 5 on your friend's laptop when he isn't looking (and 1000 more games and 10 trainers in the cloud, of course).

The core idea is to have two types of processes:

  • play-er - interacts with the environment, records sessions to the database, and periodically loads new params
  • train-er - reads sessions from the database, trains the agent via experience replay, and sends updated params back to the database

Both kinds of processes revolve around a database that stores experience sessions and weights. The database is currently implemented with Redis, since it is simple to set up and fast at key-value operations. You can, however, implement the database interface on top of whatever database you prefer.
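In code, the split looks roughly like the sketch below. This is a minimal illustration, assuming a hypothetical Database wrapper and an agent with play/train_on/get_params/set_params methods; the actual tinyverse interfaces may differ.

import itertools

class Database:
    """Illustrative Redis-backed store for sessions and weights;
    method names are made up for this sketch, not the tinyverse API."""
    def __init__(self, redis_client):
        self.redis = redis_client

    def save_session(self, session): ...   # append one recorded rollout
    def sample_sessions(self, n): ...      # sample a batch for replay
    def save_params(self, params): ...     # publish the latest weights
    def load_params(self): ...             # fetch the latest weights

def player_loop(agent, env, db, reload_every=10):
    """Interact with the environment, record sessions, refresh params."""
    for i in itertools.count():
        db.save_session(agent.play(env))        # record one rollout
        if i % reload_every == 0:
            agent.set_params(db.load_params())  # pick up newer weights

def trainer_loop(agent, db, batch_size=10):
    """Replay stored sessions, train, and push updated params."""
    while True:
        agent.train_on(db.sample_sessions(batch_size))  # experience replay
        db.save_params(agent.get_params())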

Quickstart

  1. Install a Redis server
  • (Ubuntu) sudo apt-get install redis-server
  • (macOS) brew install redis
  • Otherwise, search for "install redis <your OS>" or ask on gitter.
  • If you want to run on multiple machines, configure redis-server to listen on 0.0.0.0 and consider setting a password, as sketched below.
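For the multi-machine case, the relevant lines of a typical redis.conf (e.g. /etc/redis/redis.conf on Ubuntu) look like this; the password is a placeholder you should replace:

# accept connections from other machines, not just localhost
bind 0.0.0.0
# optional but recommended on shared networks
requirepass your_password_here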
  2. Install the Python packages
  • gym and universe
    • pip install gym[atari]
    • pip install universe - most likely needs extra system dependencies; see the gym and universe installation docs.
  • Install bleeding-edge Theano, Lasagne, and AgentNet for the AgentNet examples to work.
    • Preferably set up Theano to use floatX=float32 in .theanorc (see the sample config after this step).
  • pip install joblib redis prefetch_generator six
  • The examples require OpenCV: conda install -y -c https://conda.binstar.org/menpo opencv3
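A minimal .theanorc for the steps below might look like this; device=gpu assumes the old-style Theano GPU backend, so use device=cpu on machines without a GPU:

[global]
floatX = float32
device = gpu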
  3. Spawn several player processes. Each process simply interacts with the environment and saves the results; -b stands for batch size.
for i in `seq 1 10`; do
    python tinyverse atari.py play -b 3 &
done
  4. Spawn the trainer process (the demo below runs on GPU; change the device to CPU if you have to).
THEANO_FLAGS=device=gpu python tinyverse atari.py train -b 10 &
  5. Evaluate the results at any time (records video to ./records).
python tinyverse atari.py eval -n 5

Devs: see workbench.ipynb
