Imitation Learning via Kernel Mean Embedding

Kee-Eung Kim and Hyun Soo Park

This repository contains the code for the paper "Imitation Learning via Kernel Mean Embedding".

The implementation is based on Jonathan Ho's GAIL (Generative Adversarial Imitation Learning) code.

It contains implementations of Trust Region Policy Optimization (Schulman et al., 2015) and Generative Adversarial Imitation Learning (Ho and Ermon, 2016).

Dependencies:

Python 2.7
OpenAI Gym >= 0.1.0, mujoco_py >= 0.4.0
numpy >= 1.10.4, scipy >= 0.17.0, theano >= 0.8.2
h5py, pytables, pandas, matplotlib
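
Since the minimum versions above are easy to get wrong when compared as plain strings (for example, "1.9.3" sorts after "1.10.4" alphabetically), a small check can help before running the scripts. The following is a minimal sketch, not part of this repository: the helper names (version_tuple, meets_minimum, check_installed) are hypothetical, and it assumes each package exposes a `__version__` attribute under the import names shown.

```python
import importlib
import re

# Minimum versions copied from the dependency list above.
# The keys are the assumed import names of each package.
MINIMUM_VERSIONS = {
    "gym": "0.1.0",
    "mujoco_py": "0.4.0",
    "numpy": "1.10.4",
    "scipy": "0.17.0",
    "theano": "0.8.2",
}


def version_tuple(version):
    """Turn '1.10.4' or '0.8.2.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, required):
    """Compare versions numerically, component by component."""
    return version_tuple(installed) >= version_tuple(required)


def check_installed(minimums=MINIMUM_VERSIONS):
    """Return {name: True/False/None}; None means not importable or no version."""
    report = {}
    for name, required in sorted(minimums.items()):
        try:
            module = importlib.import_module(name)
        except Exception:  # some packages fail at import for non-ImportError reasons
            report[name] = None
            continue
        installed = getattr(module, "__version__", None)
        report[name] = None if installed is None else meets_minimum(installed, required)
    return report


if __name__ == "__main__":
    for name, ok in check_installed().items():
        print("%-10s %s" % (name, {True: "ok", False: "too old", None: "not found"}[ok]))
```

Run from the environment you intend to use for training; any "too old" or "not found" entry points to a dependency that needs attention.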

Provided files:

expert_policies/* are the expert policies, trained by TRPO (scripts/run_rl_mj.py) on the true costs
scripts/im_pipeline.py is the main training and evaluation pipeline. This script is responsible for sampling data from experts to generate training data, running the training code (scripts/imitate_mj.py), and evaluating the resulting policies.
pipelines/* are the experiment specifications provided to scripts/im_pipeline.py
results/* contain evaluation data for the learned policies

Hyperparameters:

To set hyperparameters, pass command-line arguments to scripts/imitate_mj.py. For example, run python scripts/imitate_mj.py --mode gmmil --reward_type mmd --data EXPERT_TRAJ_PATH --env_name ENV_NAME. For more details, see the example shell script train.sh.
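
When launching several runs, it can be convenient to assemble the command from Python rather than by hand. The sketch below only builds the argument list using the four flags shown in the example above; the function name build_imitate_command and its defaults are hypothetical, and any further flags should be taken from train.sh rather than guessed.

```python
def build_imitate_command(data_path, env_name, mode="gmmil",
                          reward_type="mmd", python_bin="python"):
    """Assemble the argv list for scripts/imitate_mj.py.

    Only the flags documented in this README are used:
    --mode, --reward_type, --data and --env_name.
    """
    return [
        python_bin, "scripts/imitate_mj.py",
        "--mode", mode,
        "--reward_type", reward_type,
        "--data", data_path,
        "--env_name", env_name,
    ]


if __name__ == "__main__":
    # EXPERT_TRAJ_PATH and ENV_NAME are placeholders, as in the README example.
    cmd = build_imitate_command("EXPERT_TRAJ_PATH", "ENV_NAME")
    print(" ".join(cmd))
    # To actually start training, run from the repository root:
    #   import subprocess
    #   subprocess.check_call(cmd)
```

Passing the arguments as a list (instead of one shell string) avoids quoting problems when the trajectory path contains spaces.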

Repository: xairc/gmmil
