Intelligent Agents and Decision Making course, Spring 2016
- Markov Decision Processes: simulators
- Finite Horizon Optimization: value iteration
- Infinite Horizon Optimization: value iteration and policy iteration
- Bandit Algorithms: incremental uniform, UCB, epsilon greedy
- Reinforcement Learning: Q-learning
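
As an illustration of the infinite-horizon value iteration topic listed above, here is a minimal sketch in Python. It is not taken from the course materials: the function name `value_iteration`, the `transitions`/`rewards` layout, the discount `gamma`, the tolerance `tol`, and the two-state MDP are all assumptions made up for this example.

```python
def value_iteration(transitions, rewards, gamma=0.9, tol=1e-8):
    """Discounted infinite-horizon value iteration (illustrative sketch).

    transitions[s][a] is a list of (probability, next_state) pairs.
    rewards[s][a] is the immediate reward for taking action a in state s.
    Both layouts are assumptions for this example, not a course-defined format.
    """
    n_states = len(transitions)
    values = [0.0] * n_states
    while True:
        # Bellman optimality backup for every state.
        new_values = [
            max(
                rewards[s][a]
                + gamma * sum(p * values[s2] for p, s2 in transitions[s][a])
                for a in range(len(transitions[s]))
            )
            for s in range(n_states)
        ]
        # Stop once the largest change across states falls below tol.
        if max(abs(n - v) for n, v in zip(new_values, values)) < tol:
            return new_values
        values = new_values


# Hypothetical two-state MDP: action 0 stays put, action 1 switches state.
transitions = [
    [[(1.0, 0)], [(1.0, 1)]],  # from state 0
    [[(1.0, 1)], [(1.0, 0)]],  # from state 1
]
rewards = [
    [0.0, 0.0],  # state 0 never pays
    [1.0, 0.0],  # staying in state 1 pays 1
]

values = value_iteration(transitions, rewards)
print(values)
```

With `gamma = 0.9`, staying in state 1 forever is worth 1 / (1 - 0.9) = 10, and state 0 is worth 0.9 × 10 = 9 because it takes one step to get there, so the printed values should be close to 9 and 10.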