PyTorch CartPole Example

Modified from https://github.com/g6ling/Reinforcement-Learning-Pytorch-Cartpole.

How to use this repo:

Make sure that you have installed miniconda (for Linux, see https://docs.conda.io/projects/conda/en/latest/user-guide/install/linux.html)
cd into this repo
cd packages
conda create --name pomdpr python=3.7 where pomdpr stands for POMDP and Robotics
conda activate pomdpr
chmod +x install_packages.sh where chmod makes the bash script executable on your device
./install_packages.sh installs numpy scipy torch gym PyYAML wandb and rl_parsers and gym-pomdp (these two are stored inside drqn/packages)
cd .. back to the top level of the repo
Test your installation using python algorithms/POMDP/3-DRQN-Store-State-HeavenHell/train.py --lr=0.00005 --use_experts=0 --seed=1 --debug_mode=1 where debug_mode=1 makes sure that wandb is not used
wandb login
Do anything you want now.

Note that if you modify the Heaven-Hell pomdp file (e.g., modify the initial belief or the starting state distribution) you will need to re-install gym-pomdp for the change to take effect.

Below is README from the original repo.

PyTorch CartPole Example

Simple Cartpole example writed with pytorch.

Why Cartpole?

Cartpole is very easy problem and is converged very fast in many case. So you can run this example in your computer(maybe it take just only 1~2 minitue).

Rainbow

PG(Policy Gradient)

Parallel

Distributional DQN

QRDQN [18]
IQN [19]

Exploration

ICM [22]
RND [17]

POMDP (With RNN)

Reference

Acknowledgements

Use Cuda

check this issue. g6ling/Reinforcement-Learning-Pytorch-Cartpole#1

Name		Name	Last commit message	Last commit date
Latest commit History 62 Commits
algorithms		algorithms
demos		demos
domains		domains
domains_conf		domains_conf
packages		packages
.gitignore		.gitignore
.pylintrc		.pylintrc
LICENSE		LICENSE
README.md		README.md
test_domain.py		test_domain.py
test_wandb.py		test_wandb.py

License

zhihanyang2022/drqn

Folders and files

Latest commit

History

Repository files navigation

PyTorch CartPole Example

Why Cartpole?

Rainbow

PG(Policy Gradient)

Parallel

Distributional DQN

Exploration

POMDP (With RNN)

Reference

Acknowledgements

Use Cuda

About

Resources

License

Stars

Watchers

Forks

Languages