Build a mini Stratego AI
Tasks: overall strategy
- complete the heuristic minmax agent first, then use it to bootstrap the reinforce agent
- program an arena for pitting AIs against each other
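
Sketch of what the arena could look like. Everything here is an assumption for illustration: a tiny Nim game stands in for Stratego so the code runs on its own, and the names legal_moves, apply_move, winner, play_game and arena are hypothetical. An agent is taken to be any function from (state, legal moves) to a move.

```python
import random

# Toy stand-in game (NOT Stratego): Nim with 1-3 stones taken per turn.
# A state is (stones_left, player_to_move); whoever takes the last stone wins.

def legal_moves(state):
    return [n for n in (1, 2, 3) if n <= state[0]]

def apply_move(state, move):
    return (state[0] - move, 1 - state[1])

def winner(state):
    """Return the winning player index, or None if the game is not over."""
    stones, to_move = state
    # The player who just moved took the last stone.
    return 1 - to_move if stones == 0 else None

def random_agent(state, moves):
    return random.choice(moves)

def greedy_agent(state, moves):
    # Leave the opponent a multiple of 4 when possible, else take one stone.
    for m in moves:
        if (state[0] - m) % 4 == 0:
            return m
    return moves[0]

def play_game(agents, start_stones=10):
    """Play one game; agents[0] moves first. Returns the winner's index."""
    state = (start_stones, 0)
    while winner(state) is None:
        move = agents[state[1]](state, legal_moves(state))
        state = apply_move(state, move)
    return winner(state)

def arena(agent_a, agent_b, n_games=100):
    """Play n_games, alternating who moves first; return [wins_a, wins_b]."""
    wins = [0, 0]
    for g in range(n_games):
        if g % 2 == 0:
            wins[play_game([agent_a, agent_b])] += 1
        else:
            # Seats are swapped, so map the seat index back to the agent.
            wins[1 - play_game([agent_b, agent_a])] += 1
    return wins
```

Alternating the first mover keeps the comparison fair when moving first is an advantage, which it is in this toy game.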
heuristic minmax agent
- probability distribution sampling
- tweak or learn evaluation heuristic
- construct dataset of games with minmax policy
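
Sketch of the minmax part and the dataset step, under stated assumptions: the same toy Nim game stands in for Stratego, evaluate is a placeholder heuristic (the real one is what gets tweaked or learned), and self_play_dataset is a hypothetical helper that records (state, chosen move) pairs. The probability distribution sampling item is not covered here.

```python
# Toy stand-in game (NOT Stratego): state is (stones_left, player_to_move).

def legal_moves(state):
    return [n for n in (1, 2, 3) if n <= state[0]]

def apply_move(state, move):
    return (state[0] - move, 1 - state[1])

def is_terminal(state):
    return state[0] == 0

def evaluate(state, player):
    """Placeholder heuristic: value of `state` from `player`'s point of view."""
    stones, to_move = state
    if stones == 0:
        # The player who just moved took the last stone and wins.
        return 1.0 if to_move != player else -1.0
    # Assumed heuristic: facing a multiple of 4 is bad for the player to move.
    bad_for_mover = stones % 4 == 0
    return 0.5 if (to_move == player) != bad_for_mover else -0.5

def minimax(state, depth, player):
    """Depth-limited minimax; returns (value, best_move) for `player`."""
    if depth == 0 or is_terminal(state):
        return evaluate(state, player), None
    maximizing = state[1] == player
    best_value = float("-inf") if maximizing else float("inf")
    best_move = None
    for move in legal_moves(state):
        value, _ = minimax(apply_move(state, move), depth - 1, player)
        if (maximizing and value > best_value) or (
            not maximizing and value < best_value
        ):
            best_value, best_move = value, move
    return best_value, best_move

def self_play_dataset(start_stones, depth):
    """Record (state, chosen_move) pairs from one minimax self-play game."""
    data = []
    state = (start_stones, 0)
    while not is_terminal(state):
        _, move = minimax(state, depth, state[1])
        data.append((state, move))
        state = apply_move(state, move)
    return data
```

Pairs like these could later serve as supervised targets when bootstrapping the reinforce agent.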
reinforce agent
- reinforce small game
- reinforce complete game integration
- agent can only play legal moves (policy is a probability distribution over legal moves only)
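
Sketch of one way to restrict the policy to legal moves: a masked softmax that gives illegal moves probability zero and renormalizes over the rest. The names masked_softmax and sample_move are hypothetical, and the logits are assumed to come from whatever policy model is used, with one entry per move slot.

```python
import math
import random

def masked_softmax(logits, legal_mask):
    """Softmax over legal moves only; illegal moves get probability 0."""
    legal_logits = [l for l, ok in zip(logits, legal_mask) if ok]
    if not legal_logits:
        raise ValueError("no legal moves")
    # Subtract the max legal logit for numerical stability.
    m = max(legal_logits)
    exps = [math.exp(l - m) if ok else 0.0 for l, ok in zip(logits, legal_mask)]
    total = sum(exps)
    return [e / total for e in exps]

def sample_move(logits, legal_mask, rng=random):
    """Sample a move index from the masked policy."""
    probs = masked_softmax(logits, legal_mask)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Usage: move 1 has the highest logit but is illegal, so it is never chosen.
probs = masked_softmax([1.0, 5.0, 1.0], [True, False, True])
move = sample_move([1.0, 5.0, 1.0], [True, False, True])
```

Masking before normalizing, rather than resampling on illegal picks, keeps the sampled move's probability well defined for a policy-gradient update.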