chainer-bear

Reproduction code for Bootstrapping Error Accumulation Reduction (BEAR), implemented with Chainer.

Prerequisites

Install chainer, d4rl, and tensorboardX before using the code.
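
As a quick sanity check before training, the snippet below reports which of the three prerequisites cannot be found in the current Python environment. It is only a sketch and not part of this repository: the helper name `missing_packages` is hypothetical, and the import names `chainer`, `d4rl`, and `tensorboardX` are assumed to match the package names listed above.

```python
import importlib.util

# Import names assumed to correspond to the prerequisites listed above.
REQUIRED = ("chainer", "d4rl", "tensorboardX")


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```

`find_spec` only locates a package without importing it, so the check stays fast and does not initialize a GPU backend.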

How to train

with gpu

$ python3 main.py --env="Ant-v2" --datafile=<path to buffer file> --gpu=<gpu number>

without gpu

$ python3 main.py --env="Ant-v2" --datafile=<path to buffer file>

Results

I tested only with Ant-v2 data and found that the Laplacian kernel is considerably more stable than the Gaussian kernel.
However, both kernels succeeded in learning a policy that scores similarly to the behavior policy used to gather the training data.
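
For readers unfamiliar with the two kernels, here is a minimal NumPy sketch of a sample-based squared MMD estimate with either a Laplacian or a Gaussian kernel. It is illustrative only and does not reproduce the code in bear.py: the function names, the bandwidth parameter `sigma`, its default value, and the exact scaling inside each exponent are all assumptions.

```python
import numpy as np


def pairwise_kernel(x, y, kind="laplacian", sigma=20.0):
    """Kernel matrix k(x_i, y_j) for x of shape (n, d) and y of shape (m, d)."""
    diff = x[:, None, :] - y[None, :, :]  # shape (n, m, d)
    if kind == "laplacian":
        # Laplacian kernel: exponential of the negative L1 distance.
        return np.exp(-np.abs(diff).sum(axis=-1) / sigma)
    if kind == "gaussian":
        # Gaussian kernel: exponential of the negative squared L2 distance.
        return np.exp(-(diff ** 2).sum(axis=-1) / (2.0 * sigma ** 2))
    raise ValueError(f"unknown kernel: {kind}")


def mmd_squared(x, y, kind="laplacian", sigma=20.0):
    """Biased (V-statistic) estimate of the squared MMD between two sample sets."""
    k_xx = pairwise_kernel(x, x, kind, sigma).mean()
    k_xy = pairwise_kernel(x, y, kind, sigma).mean()
    k_yy = pairwise_kernel(y, y, kind, sigma).mean()
    return k_xx - 2.0 * k_xy + k_yy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data_actions = rng.normal(size=(8, 3))   # stand-in for dataset actions
    policy_actions = data_actions + 5.0      # stand-in for shifted policy actions
    for kind in ("laplacian", "gaussian"):
        print(kind, mmd_squared(data_actions, policy_actions, kind))
```

The estimate is zero when both sample sets are identical and grows as they move apart. The two kernels differ in how quickly the similarity decays with distance, which is one plausible place to look when comparing their training stability.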

The graphs below show the results of a single training run with the Laplacian kernel.

evaluation result

Policy performance suddenly decreases after 200k iterations.

mmd loss

vae loss
