For Quest. Let's discuss the use of git for this.
Notes on using neural nets:
Could use evolutionary nets - randomly generate them, then build each new generation from the best-performing ones. Works for any kind of model (neural nets aren't really great for this)
Could train on examples of humans playing, based on whether or not they won (and apply a negative gradient if it's an example of a loss!)
Could train on examples of whether or not a random model won or lost.
Probably needs Q-Learning. Should implement Q-Learning (all of the above are pretty iffy and not bueno)
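Rough sketch of what tabular Q-Learning could look like, just to pin the idea down. Everything here is an assumption for illustration: the environment is a made-up 6-cell corridor (not the real game), and the names `step`, `train`, `greedy_policy` and the values of `alpha`, `gamma`, `epsilon` are placeholders. The real thing would need its own state/action encoding and rewards.

```python
import random

N_STATES = 6          # hypothetical corridor: cells 0..5, the goal is cell 5
ACTIONS = (-1, +1)    # step left or step right

def step(state, action):
    """Toy environment (assumption, not the real game).
    Returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore sometimes, and break ties randomly.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-Learning update: move Q(s, a) toward r + gamma * max Q(s', .)
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

def greedy_policy(q):
    """Best known action for every non-terminal cell."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]
```

Calling `greedy_policy(train())` should give "step right" for every cell once the table has settled. A table like this only works while the state space is small; a bigger game would probably need a function approximator (e.g. a neural net) in place of `q`.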