Implementation of Guided Policy Search (GPS) on Hopping Bot.
This repo has been public for educational purposes and contribution are welcomed. I have stopped working on this, but if someone else wants to take this up and wants to collabarate with me, you are more than welcomed. Below have I provided a glimpse of all the TODO tasks.
It goes without saying that I would like to offer my thanks to Chelesa Finn (professor at Stanford) and S. Levine (professor at UC Berkley) (and many others who contributed in developing this) for making their code an open source and making their work on GPS public. I would like offer my thanks for their generous contribution to the field of Reinforcement Learning.
This code is unstable and needs lots of polishing. Many parts of the code are still under development. Currently, I am working out the math required and re-deriving all the important equations of the GPS (please refer to S. Levine thesis on learning motor skills).
Following people must be cited for their online/open source work iLQR, GPS, OpenAI. Also commerical softwares like MUJOCO.
Most of the code was written by taking inspiration from the original publishers but I have added my flavour. I have trimmed code for our purposes or I have added other functionalities. Please cite me if you are using this repository: Author: Sameer Kumar; Date: May 17th 2019; Title: GPS on Hopping task; Designation and School: Phd student in Texas A&M
. That date corresponds to when I made this code public. Also for contact information you can refer to my website.
- Update:
sudo apt-get update
- Jupyter Notebook: Run
python3 -m pip install --upgrade pip
and thenpython3 -m pip install jupyter
- Required Dependencies: Run
pip3 install -r requirement.txt
- Scikit-Learn:
pip3 install -U scikit-learn
- iLQR module: Go to ilqr-master_new and run
python3 setup.py install
- GPU Drivers: To install drivers for Nvidia which are needed for running GPU follow the instructions in the following link. Check if drivers are connected and they are responding by running
nvidia-smi
. This may give lot troubles hence be patient but this is the most hardest part of installation, after this is done you are all set. If you can't install tensorflow-gpu then just use tensorlfow (cpu version), it should be fine. To install tensorflow (cpu) remove the tensorflow-gpu which should have been installed byrequirement.txt
. For this runpip3 uninstall tensorflow-gpu
and then runpip3 install tensorflow
. - MUJOCO: To install go to following link. There you can install student license version of this software. This may take time please by patient, but should be easier than above.
- Update and reboot (not necessary but recommened):
sudo apt-get update && sudo reboot
. - Finally run
TrajGenerator-V2.ipynb
- Bodies: Torso, Thigh, Leg, Foot (kinematic chain in order as per xml file)
- Torso: X: Slider, Y: Hinge, Z: Slider i.e, we have only linear movement in X, Z direction. And we have rotation in Y direction.
- Thigh: Hinge joint in Y axis. Angle limit [-150, 0] degrees, Fricition 0.9.
- Leg: Hinge joint in Y axis. Angle limit [-150, 0] degrees, Fricition 0.9.
- Foot: Hinge joint in Y axis. Angle limit [-150, 0] degrees, Fricition 2.0.
- States: X = [ZPos, XPos, YPos, YDeg, YDeg, YDeg] in the order of kinematic links. Velocities will also be in the same order.
- Algorithm_BADMM.py
- Trajectory_Opt_LQR.py
- Algoithm.py
- Hyperparams.py
- Agent.py
- Agent_Utils.py
- Sample.py
- Sample_List.py
- Agent_Config.py
- Cost_Sum.py
- Cost_State.py
- Cost_Action.py
- Cost_Utils.py
- Config_ALG_BADMM.py
- Config_Traj_Opt.py
- Algorithm_Utils.py
- Traj_Opt_Utils.py
- Linear_Gauss_Policy.py
- GPS_General_Utils.py
- Policy.py
- GMM.py
- Add fully working
GMM.py
. - Test MUJOCO Cart Pole and write function to extract states, send control and to control rendering.
- Figure out what does the following function do
class BundleType
. - Figure out what does the following function do
func extract_condition
. - Add function to calculate nominal trajectories using state dynamic matrices given by GMM.
- Install TensorFlow GPU in both Python 2 and Python 3.
- Install MUJOCO on PC.
- In GitHub add result folder and simulator results.
- Write
Agent.py
files. - Recheck the files required for Trajectory Optimization.
- Add function to calculate the nominal trajectory (i.e. iLQR optimized) using state dynamic matrices given by GMM.
- Read more about Agent.py file. What is the function of this file? Where it is called?
- Write the
Cost.py
files. Look how cost functions are written and how they are generalized? - Write
Hyperparameter.py
files. - Understand how
Traj_Opt_Utils.py
computes KL-Divergence. - Do we need noise patterns in the
LinearGaussianPolicy.py
file? - Figure out what is Policy GMM and why are they are using it? Where does the code for Policy go? Where it is called?
- Figure out what these files do:
gps.py
(main file). - Figure out what these files do:
PolicyOptCaffe.py
(then modify it into tensorflow based policy). - Figure out what these files do:
gps.gui.config.py
(Rendering plus GUI can we make our code without using this?). - Figure out what these files do:
gps.proto.gps_pb2.py
(can we make our code without using this?). - Add information regarding CartPole simulator in Readme file.