CS285 Reinforcement Learning Final Project 2019 - SUBBAT (Subgoals Under Biologically Based Action Trajectories)
Authors: Calvin T Chi, Ryan Moughan, Madeleine C Snyder
Based on: https://arxiv.org/pdf/1604.06057.pdf
Aim 1: Replicate the paper's model
Aim 2: Change the structure of the subgoal --> metacontroller --> controller flow
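
For orientation, here is a minimal Python sketch of a two-level control loop of the kind the aims refer to: a meta-controller picks a subgoal, and a controller acts until that subgoal is reached. It is not the project's code. ChainEnv and every function name below are hypothetical, and both policies are hand-written placeholders rather than learned Q-networks.

```python
import random


class ChainEnv:
    """Hypothetical toy environment: states 0..n-1 on a line, reward at the right end."""

    def __init__(self, n=6):
        self.n = n
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action 1 moves right, any other action moves left; walls clamp the state
        delta = 1 if action == 1 else -1
        self.state = max(0, min(self.n - 1, self.state + delta))
        done = self.state == self.n - 1
        reward = 1.0 if done else 0.0  # extrinsic reward only at the last state
        return self.state, reward, done


def meta_controller(state, subgoals, rng):
    """Placeholder meta-level policy: pick a random subgoal other than the current state."""
    candidates = [g for g in subgoals if g != state]
    return rng.choice(candidates)


def controller(state, goal):
    """Placeholder low-level policy: step toward the subgoal."""
    return 1 if goal > state else 0


def intrinsic_reward(state, goal):
    """Internal critic (assumed form): 1 when the subgoal is reached, else 0."""
    return 1.0 if state == goal else 0.0


def run_episode(env, subgoals, rng, max_steps=100):
    state = env.reset()
    done = False
    steps = 0
    log = []  # one (subgoal, extrinsic return while pursuing it) pair per meta-level decision
    while not done and steps < max_steps:
        goal = meta_controller(state, subgoals, rng)
        extrinsic = 0.0
        reached = False
        # The controller keeps acting until the subgoal is reached or the episode ends.
        while not (done or reached) and steps < max_steps:
            action = controller(state, goal)
            state, reward, done = env.step(action)
            extrinsic += reward
            reached = intrinsic_reward(state, goal) > 0
            steps += 1
        log.append((goal, extrinsic))
    return log, steps
```

Calling run_episode(ChainEnv(6), [2, 5], random.Random(0)) returns the list of subgoals chosen and the extrinsic reward collected under each one. In a learned version, the meta-controller would be trained on that extrinsic return and the controller on the intrinsic reward. Changing the flow, as in Aim 2, would amount to changing how meta_controller and controller are wired together in run_episode.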