Common RL framework and utilities.
- maximize module reuse under the RL framework
- accumulate experiment baselines for algorithms, with hyperparameter setups that are easily reproducible
- accumulate experiment baselines for problems, together with the algorithm/hyperparameter/engineering effort behind them
- easily implement more algorithms
- easily combine different works
- [v] DQN
- [v] DDPG
- [v] Replay Buffer
- [v] Prioritized Exp Replay
- [v] Double DQN
- [v] Dueling DQN
- [v] Actor Critic
- [v] Optimality Tightening
- [v] A3C
- [v] PPO
- [v] Bootstrapped DQN
- [v] ICM
- [v] I2A
- [v] Soft Q Learning
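As an illustration of one of the utilities listed above, a uniform replay buffer can be sketched roughly as follows. This is a generic sketch, not hobotrl's actual implementation; class and method names are hypothetical:

```python
import random
from collections import deque

class ReplayBuffer:
    """Minimal uniform-sampling replay buffer (illustrative sketch only)."""

    def __init__(self, capacity):
        # deque with maxlen evicts the oldest transition when full
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # uniform sampling without replacement
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```

Prioritized experience replay extends this idea by sampling transitions in proportion to their TD error instead of uniformly.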
Run

```shell
pip install -e .
```

so you can use the algorithms elsewhere.
Running different experiments may require additional libraries, such as `opencv-python`, `gym[box2d]`, or `roboschool`.
Run

```shell
python test/exp_tabular.py run --name TabularGrid
```

for a starter. Run

```shell
python test/exp_tabular.py list
python test/exp_deeprl.py list
```

to get a list of experiments in each experiment file.
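The `list`/`run` command pattern above can be backed by a simple experiment registry. The sketch below is hypothetical, not hobotrl's actual code; class and method names are assumptions:

```python
class Experiment(object):
    """Base class keeping a registry of experiments by name (illustrative sketch)."""

    registry = {}

    @classmethod
    def register(cls, experiment_cls):
        # experiments register themselves under their class name
        cls.registry[experiment_cls.__name__] = experiment_cls
        return experiment_cls

    @classmethod
    def list_experiments(cls):
        # what a `list` subcommand would print
        return sorted(cls.registry)

    @classmethod
    def run(cls, name):
        # what `run --name <name>` would dispatch to
        return cls.registry[name]().run()


@Experiment.register
class TabularGrid(Experiment):
    def run(self):
        return "TabularGrid finished"
```

A command-line front end would then map the `list` and `run` subcommands onto `list_experiments()` and `run(name)`.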
Run

```shell
. scripts/a3c_pong.sh
```

to start the processes running the A3C algorithm.
Run

```shell
python -m unittest discover -s hobotrl -p "test*.py" -v
```

to run all unit test cases.
Use

```python
>>> import hobotrl as hrl
>>> dir(hrl)
```

to see what's inside.
The most widely used classes, such as `DQN`, `DPG`, and `ActorCritic`, are imported into the top-level `hobotrl` module. Use

```python
>>> help(hrl.DQN)
>>> help(hrl.DPG)
>>> help(hrl.ActorCritic)
```

to consult the help docs. Also remember to check out the experiment files and unit tests as a reference.
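A typical agent in such a framework is driven by an environment loop. The sketch below uses a stubbed environment and a random agent; the `act`/`step`/`reset` names follow the common gym-style convention and are assumptions, not hobotrl's actual API:

```python
import random

class StubEnv(object):
    """Tiny stand-in environment: fixed reward, episode ends after 5 steps."""

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        reward = 1.0
        done = self.t >= 5
        return self.t, reward, done

class RandomAgent(object):
    """Placeholder agent choosing actions uniformly at random."""

    def act(self, state):
        return random.choice([0, 1])

def run_episode(env, agent):
    # standard agent-environment interaction loop
    state = env.reset()
    total_reward, done = 0.0, False
    while not done:
        state, reward, done = env.step(agent.act(state))
        total_reward += reward
    return total_reward
```

Replacing `RandomAgent` with a learning agent (and adding a learning call after each `step`) yields the usual training loop.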
In hobotrl, distributed training is implemented with TensorFlow's cluster capability. See the bash scripts in the `scripts` folder for starting `worker` and `ps` processes for distributed training.
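A TensorFlow cluster is described by a mapping from job names to the addresses of the `ps` and `worker` processes. A minimal sketch of such a configuration, with placeholder hosts and ports:

```python
# Cluster layout: one parameter server, two workers (addresses are placeholders).
cluster_spec = {
    "ps": ["localhost:2222"],
    "worker": ["localhost:2223", "localhost:2224"],
}

# Each process would then be launched with its own job name and task index,
# e.g. via tf.train.ClusterSpec(cluster_spec) and
# tf.train.Server(cluster, job_name="worker", task_index=0).
```

The bash scripts in `scripts` essentially start one process per entry in such a mapping.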
The steps for starting the driving simulator environment:

- Open up a new shell and execute

  ```shell
  roscore
  ```

  to launch the ROS master.
- Open up another shell; first run

  ```shell
  source [catkin_ws_dir]/devel/setup.bash
  ```

  to register the simulator ROS packages, then run

  ```shell
  python rviz_restart.py
  ```

  to fire up the simulator launcher.
- Use the last shell to run the actual main script, in which a `DrivingSimulatorEnv` is instantiated to communicate with the previously opened nodes as well as the agent.

Note these steps are tentative and subject to change.
See this wiki entry for a recommended way of reusing variables via global variable scope reuse. [Note: setting a global scope reference will break the creation of the target network.]
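The target-network pitfall noted above can be illustrated without TensorFlow: if the "target" network merely holds a reference to the online network's parameters instead of its own copy, freezing the target between syncs becomes impossible. A plain-Python illustration (not hobotrl code):

```python
online_params = {"w": 1.0}

# Broken: the "target" is just a reference to the online parameters.
target_by_reference = online_params

# Correct: the target network keeps its own copy, synced only at update time.
target_by_copy = dict(online_params)

online_params["w"] = 2.0  # one training step updates the online network

# target_by_reference now silently follows the online network,
# while target_by_copy stays frozen until the next explicit sync.
```

Scope reuse in TensorFlow has the same effect as the reference case: the "target" ops read the very same variables the optimizer updates, so no frozen target network is ever created.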