Hoppingbot

Implementation of Guided Policy Search (GPS) on Hopping Bot.

Unstable

This repo has been made public for educational purposes and contributions are welcome. I have stopped working on this, but if someone else wants to take this up and collaborate with me, you are more than welcome. Below I have provided a glimpse of all the TODO tasks.

It goes without saying that I would like to thank Chelsea Finn (professor at Stanford) and S. Levine (professor at UC Berkeley), along with the many others who contributed to developing GPS, for open-sourcing their code and making their work on GPS public. I am grateful for their generous contribution to the field of Reinforcement Learning.

This code is unstable and needs a lot of polishing. Many parts of the code are still under development. Currently, I am working out the required math and re-deriving all the important equations of GPS (please refer to S. Levine's thesis on learning motor skills).
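For reference, the central equation these derivations revolve around is the constrained formulation of GPS (written here in my own notation, following Levine's thesis; not an equation copied from this repo):

```latex
% GPS as constrained optimization: local controllers p vs. global policy pi_theta
\min_{p,\;\theta} \; \mathbb{E}_{\tau \sim p}\!\left[\ell(\tau)\right]
\quad \text{s.t.} \quad
p(\mathbf{u}_t \mid \mathbf{x}_t) = \pi_\theta(\mathbf{u}_t \mid \mathbf{x}_t)
\;\; \forall\, \mathbf{x}_t, \mathbf{u}_t, t
```

The BADMM variant (cf. Algorithm_BADMM.py in the file list below) relaxes this equality constraint with Lagrange multipliers and alternates between optimizing the local controllers p and the policy parameters θ.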

Cite

The following should be cited for their online/open-source work: iLQR, GPS, and OpenAI, as well as commercial software such as MUJOCO.

Most of the code was written by taking inspiration from the original publishers, but I have added my own flavour: I have trimmed the code for our purposes and added other functionality. Please cite me if you are using this repository: Author: Sameer Kumar; Date: May 17th, 2019; Title: GPS on Hopping task; Designation and School: PhD student at Texas A&M. That date corresponds to when I made this code public. For contact information you can refer to my website.

How to Install

  • Update: sudo apt-get update
  • Jupyter Notebook: Run python3 -m pip install --upgrade pip and then python3 -m pip install jupyter
  • Required Dependencies: Run pip3 install -r requirement.txt
  • Scikit-Learn: pip3 install -U scikit-learn
  • iLQR module: Go to ilqr-master_new and run python3 setup.py install
  • GPU Drivers: To install the Nvidia drivers needed for running on the GPU, follow the instructions in the following link. Check that the drivers are installed and responding by running nvidia-smi. This step may give you a lot of trouble, so be patient; it is the hardest part of the installation, and once it is done you are all set. If you can't install tensorflow-gpu, just use tensorflow (the CPU version); it should be fine. To install the CPU version, remove the tensorflow-gpu that requirement.txt installed: run pip3 uninstall tensorflow-gpu and then pip3 install tensorflow.
  • MUJOCO: To install, go to the following link, where you can get the student-license version of the software. This may take time, so please be patient, but it should be easier than the step above.
  • Update and reboot (not necessary but recommended): sudo apt-get update && sudo reboot.
  • Finally, run TrajGenerator-V2.ipynb. A quick sanity check for the TensorFlow and MUJOCO installation is sketched below.
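The following is a minimal post-install sanity check, assuming TensorFlow 1.x and mujoco-py are installed as described above; it is an illustrative sketch, not a script shipped with this repo.

```python
# Minimal post-install sanity check (a sketch; assumes TensorFlow 1.x and
# mujoco-py as installed above -- not a script from this repository).
import tensorflow as tf
from tensorflow.python.client import device_lib

# List the devices TensorFlow can see; no "GPU" entry means you are on the
# CPU-only build, which is fine for running the notebook.
devices = [d.name for d in device_lib.list_local_devices()]
print("TensorFlow", tf.__version__, "devices:", devices)

try:
    import mujoco_py  # needs a valid MuJoCo license key
    print("mujoco-py imported successfully")
except ImportError as err:
    print("MuJoCo bindings not available:", err)
```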

Information regarding Hopper:


  • Bodies: Torso, Thigh, Leg, Foot (kinematic chain in the order given in the xml file)
  • Torso: X: Slider, Y: Hinge, Z: Slider, i.e., linear movement only along the X and Z directions, and rotation about the Y axis.
  • Thigh: Hinge joint about the Y axis. Angle limit [-150, 0] degrees, Friction 0.9.
  • Leg: Hinge joint about the Y axis. Angle limit [-150, 0] degrees, Friction 0.9.
  • Foot: Hinge joint about the Y axis. Angle limit [-150, 0] degrees, Friction 2.0.
  • States: X = [ZPos, XPos, YPos, YDeg, YDeg, YDeg], in the order of the kinematic links; velocities follow the same order (see the state-extraction sketch after this list).
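For illustration, a state vector in this layout could be read from a MuJoCo simulation roughly as follows. This is a sketch using mujoco-py; the file name hopper.xml and the helper get_state are assumptions for illustration, not code from this repo.

```python
# Sketch of reading the Hopper state described above with mujoco-py.
# "hopper.xml" and get_state() are illustrative names, not repo files.
import numpy as np
import mujoco_py

model = mujoco_py.load_model_from_path("hopper.xml")
sim = mujoco_py.MjSim(model)

def get_state(sim):
    # qpos holds the joint positions in kinematic-chain order
    # (torso, thigh, leg, foot); qvel holds the matching velocities.
    return np.concatenate([sim.data.qpos.ravel(), sim.data.qvel.ravel()])

x = get_state(sim)
print("state dimension:", x.shape[0])
```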

Files to add:

  • Algorithm_BADMM.py
  • Trajectory_Opt_LQR.py
  • Algorithm.py
  • Hyperparams.py
  • Agent.py
  • Agent_Utils.py
  • Sample.py
  • Sample_List.py
  • Agent_Config.py
  • Cost_Sum.py
  • Cost_State.py
  • Cost_Action.py
  • Cost_Utils.py
  • Config_ALG_BADMM.py
  • Config_Traj_Opt.py
  • Algorithm_Utils.py
  • Traj_Opt_Utils.py
  • Linear_Gauss_Policy.py
  • GPS_General_Utils.py
  • Policy.py
  • GMM.py

TODO:

  • Add fully working GMM.py.
  • Test the MUJOCO Cart Pole and write functions to extract states, send controls, and control rendering.
  • Figure out what the class BundleType does.
  • Figure out what the function extract_condition does.
  • Add a function to calculate nominal trajectories using the state-dynamics matrices given by the GMM.
  • Install TensorFlow GPU in both Python 2 and Python 3.
  • Install MUJOCO on PC.
  • Add a results folder and simulator results to the GitHub repository.
  • Write Agent.py files.
  • Recheck the files required for Trajectory Optimization.
  • Add a function to calculate the nominal trajectory (i.e. the iLQR-optimized trajectory) using the state-dynamics matrices given by the GMM.
  • Read more about the Agent.py file. What is its purpose, and where is it called?
  • Write the Cost.py files. Look at how cost functions are written and how they are generalized.
  • Write Hyperparameter.py files.
  • Understand how Traj_Opt_Utils.py computes the KL divergence (see the Gaussian KL sketch after this list).
  • Do we need noise patterns in the LinearGaussianPolicy.py file?
  • Figure out what the policy GMM is and why it is used. Where does the code for the policy go, and where is it called?
  • Figure out what these files do: gps.py (main file).
  • Figure out what these files do: PolicyOptCaffe.py (then modify it into a TensorFlow-based policy).
  • Figure out what these files do: gps.gui.config.py (rendering plus GUI; can we write our code without using this?).
  • Figure out what these files do: gps.proto.gps_pb2.py (can we write our code without using this?).
  • Add information regarding CartPole simulator in Readme file.
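As a reference for the KL-divergence TODO above: the per-time-step quantity the trajectory optimizer bounds is the KL divergence between two Gaussian action distributions. Below is a minimal sketch using the closed-form expression for multivariate Gaussians; gaussian_kl is my own illustrative helper, not the repo's Traj_Opt_Utils.py.

```python
# Sketch of the closed-form KL divergence between two multivariate Gaussians,
# KL( N(mu0, sigma0) || N(mu1, sigma1) ); illustrative only, not repo code.
import numpy as np

def gaussian_kl(mu0, sigma0, mu1, sigma1):
    k = mu0.shape[0]
    sigma1_inv = np.linalg.inv(sigma1)
    diff = mu1 - mu0
    return 0.5 * (
        np.trace(sigma1_inv @ sigma0)        # tr(S1^-1 S0)
        + diff @ sigma1_inv @ diff           # (mu1-mu0)^T S1^-1 (mu1-mu0)
        - k                                  # dimensionality
        + np.log(np.linalg.det(sigma1) / np.linalg.det(sigma0))
    )

# Example: KL between two 3-D action Gaussians.
print(gaussian_kl(np.zeros(3), np.eye(3), 0.1 * np.ones(3), 2.0 * np.eye(3)))
```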
