BonsaiAI/modular_rl: Implementation of TRPO and related algorithms

This repository implements several algorithms:

Trust Region Policy Optimization [1]
Proximal Policy Optimization (i.e., TRPO, but using a penalty instead of a constraint on KL divergence), where each subproblem is solved with either SGD or L-BFGS
Cross Entropy Method
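
As an illustration of the last item in the list above, here is a minimal sketch of the Cross Entropy Method on a toy objective. It is not the repository's implementation: the function name `cross_entropy_method`, its parameters (`batch_size`, `elite_frac`, `seed`), and the quadratic objective are all assumptions made for this example.

```python
import numpy as np

def cross_entropy_method(f, theta_mean, theta_std, n_iter=20,
                         batch_size=50, elite_frac=0.2, seed=0):
    """Maximize f by repeatedly refitting a diagonal Gaussian to the best samples.

    All names and defaults here are illustrative, not taken from modular_rl.
    """
    rng = np.random.default_rng(seed)
    mean = np.asarray(theta_mean, dtype=float)
    std = np.asarray(theta_std, dtype=float)
    n_elite = max(1, int(round(batch_size * elite_frac)))
    for _ in range(n_iter):
        # Sample candidate parameter vectors around the current mean.
        thetas = mean + std * rng.standard_normal((batch_size, mean.size))
        scores = np.array([f(th) for th in thetas])
        # Keep the top-scoring candidates and refit the Gaussian to them.
        elite = thetas[np.argsort(scores)[-n_elite:]]
        mean = elite.mean(axis=0)
        std = elite.std(axis=0)
    return mean

# Toy objective (hypothetical): a quadratic whose maximum is at `target`.
target = np.array([0.5, -0.5])
best = cross_entropy_method(lambda th: -np.sum((th - target) ** 2),
                            theta_mean=np.zeros(2), theta_std=np.ones(2))
```

In a reinforcement-learning setting, `f` would typically return the episode return obtained by a policy with parameters `th`; that mapping is left out of this sketch.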

TRPO and PPO are implemented with neural-network value functions and use GAE [2].
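
For readers unfamiliar with GAE, the following sketch computes generalized advantage estimates for a single trajectory, following the recursion in [2]. It is a standalone illustration under stated assumptions, not the code used in this repository: the function name `gae_advantages` and the convention that `values` carries one extra bootstrap entry are choices made here.

```python
import numpy as np

def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimates for one trajectory.

    rewards: length T.
    values:  length T + 1; the last entry is the bootstrap value of the
             final state (use 0.0 if the episode terminated). This
             convention is an assumption of the sketch.
    """
    rewards = np.asarray(rewards, dtype=float)
    values = np.asarray(values, dtype=float)
    # One-step TD residuals: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t).
    deltas = rewards + gamma * values[1:] - values[:-1]
    adv = np.zeros_like(deltas)
    running = 0.0
    # Discounted sum of residuals with factor gamma * lam, computed backwards.
    for t in reversed(range(len(deltas))):
        running = deltas[t] + gamma * lam * running
        adv[t] = running
    return adv

# With gamma = lam = 1 and a zero value function, the advantage at each
# step is simply the sum of the remaining rewards.
adv_full = gae_advantages([1.0, 1.0, 1.0], np.zeros(4), gamma=1.0, lam=1.0)
# With lam = 0, the estimate reduces to the one-step TD residual.
adv_td = gae_advantages([1.0, 1.0, 1.0], np.zeros(4), gamma=1.0, lam=0.0)
```

Setting `lam` between 0 and 1 trades bias (low `lam`) against variance (high `lam`).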

This library is written in a modular way so that code can be shared between the TRPO and PPO variants, and so that the same code works for different kinds of action spaces.

Dependencies:

keras (1.0.1)
theano (0.8.2)
tabulate
numpy
scipy

To run the algorithms implemented here, you should put modular_rl on your PYTHONPATH, or run the scripts (e.g. run_pg.py) from this directory.
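
If you prefer not to change PYTHONPATH in your shell, one alternative is to extend `sys.path` from Python before importing the package. This is a sketch only: the helper name `add_to_path` is hypothetical, and the directory you pass should be whichever directory on your machine contains the `modular_rl` package.

```python
import os
import sys

def add_to_path(repo_dir):
    """Put repo_dir at the front of sys.path (hypothetical helper)."""
    repo_dir = os.path.abspath(repo_dir)
    if repo_dir not in sys.path:
        sys.path.insert(0, repo_dir)
    return repo_dir

# "." is a placeholder; replace it with the directory that contains modular_rl.
added = add_to_path(".")
```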

Good parameter settings can be found in the experiments directory.

You can learn about the available parameters by running one of the experiment scripts with the -h flag while also providing the required env and agent parameters, since those two determine which other parameters are available. For example, to see the parameters of TRPO:

./run_pg.py --env CartPole-v0 --agent modular_rl.agentzoo.TrpoAgent -h

To see the parameters of CEM, add the -h flag to a command such as:

./run_cem.py --env=Acrobot-v0 --agent=modular_rl.agentzoo.DeterministicAgent  --n_iter=2
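
The reason -h needs the env and agent arguments can be pictured with a two-stage argument parser: first read the required arguments, then register the options that depend on them. The sketch below shows that pattern with the standard `argparse` module. It is an assumption about the general approach, not the repository's actual parsing code; the agent names `ToyTrpoAgent` and `ToyCemAgent` and their options are invented for the example.

```python
import argparse

# Hypothetical per-agent options; the real ones are defined by the
# repository's agent classes and will differ.
AGENT_OPTIONS = {
    "ToyTrpoAgent": [("--max_kl", float, 0.01)],
    "ToyCemAgent": [("--elite_frac", float, 0.2)],
}

def build_parser(argv):
    # Stage 1: read only the required arguments and ignore everything
    # else (including -h, because this parser has no help option).
    base = argparse.ArgumentParser(add_help=False)
    base.add_argument("--env", required=True)
    base.add_argument("--agent", required=True)
    known, _unknown = base.parse_known_args(argv)

    # Stage 2: build the full parser, adding the options that belong
    # to the selected agent. This parser does handle -h.
    parser = argparse.ArgumentParser(parents=[base])
    for flag, typ, default in AGENT_OPTIONS[known.agent]:
        parser.add_argument(flag, type=typ, default=default)
    return parser

argv = ["--env", "CartPole-v0", "--agent", "ToyTrpoAgent", "--max_kl", "0.05"]
parser = build_parser(argv)
args = parser.parse_args(argv)
help_text = parser.format_help()
```

Here `help_text` lists `--max_kl` but not `--elite_frac`, because only the options of the selected agent were registered.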

[1] J Schulman, S Levine, P Moritz, M Jordan, P Abbeel, "Trust region policy optimization." arXiv preprint arXiv:1502.05477 (2015).

[2] J Schulman, P Moritz, S Levine, M Jordan, P Abbeel, "High-dimensional continuous control using generalized advantage estimation." arXiv preprint arXiv:1506.02438 (2015).

