Modular implementation of the Vanilla Policy Gradient (VPG) algorithm with an RNN policy.
- Python 2.7 or 3.5
- TensorFlow 1.10
- gym
- numpy
- tqdm (progress bar)
- Using an RNN policy to output the action probabilities for a reinforcement learning problem
- Using a sampler that reshapes the trajectories so they can be fed into an RNN policy
- Using gradient clipping to mitigate the exploding gradient problem
- Using a GRU to mitigate the vanishing gradient problem
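The sampler reshaping and the gradient clipping listed above can be illustrated with a minimal NumPy sketch. This is not the code from this repository: the function names `pad_trajectories` and `clip_by_global_norm`, the zero-padding scheme, and the mask are assumptions made for illustration, and the repository itself uses TensorFlow rather than plain NumPy.

```python
import numpy as np


def pad_trajectories(trajectories):
    """Stack variable-length trajectories into one [batch, max_time, obs_dim] array.

    `trajectories` is a list of arrays, each of shape [T_i, obs_dim].
    Returns the zero-padded batch and a [batch, max_time] mask that is
    1.0 on real time steps and 0.0 on padding.
    (Hypothetical helper; the layout is an assumption, not the repo's sampler.)
    """
    batch_size = len(trajectories)
    max_time = max(len(traj) for traj in trajectories)
    obs_dim = trajectories[0].shape[1]

    batch = np.zeros((batch_size, max_time, obs_dim), dtype=np.float32)
    mask = np.zeros((batch_size, max_time), dtype=np.float32)
    for i, traj in enumerate(trajectories):
        batch[i, :len(traj)] = traj   # copy the real steps
        mask[i, :len(traj)] = 1.0     # mark them as valid
    return batch, mask


def clip_by_global_norm(grads, max_norm):
    """Rescale all gradients together so their joint L2 norm is at most max_norm.

    Returns the rescaled gradients and the norm measured before clipping.
    (Hypothetical helper written in NumPy for illustration.)
    """
    global_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    # The scale is 1.0 when the norm is already within the limit.
    scale = max_norm / max(global_norm, max_norm)
    return [g * scale for g in grads], global_norm
```

For example, two CartPole-like trajectories of lengths 3 and 5 with 4-dimensional observations would become a single `[2, 5, 4]` batch, and the mask lets the loss ignore the padded steps. Clipping by the global norm, rather than clipping each gradient separately, keeps the direction of the overall update unchanged.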
To train a model for CartPole-v0:
$ python run_pg_rnn.py
To view the training logs in TensorBoard:
$ tensorboard --logdir .