Move 37

Introduction
Course Curriculum

Introduction

I am attending Siraj's Move 37 course through the School of AI. This repository contains my notes and assignments as I work through the course. I also use it to collect and track the resources Siraj provides along the way.

Course Links

Weekly Notes

Homework Assignments

Course Curriculum

Course Objective

This is the syllabus for "Move 37", Siraj Raval's free reinforcement learning course, offered as part of School of AI. The course can be taken for free on YouTube as a playlist, or at School of AI for a more immersive learning experience. Reinforcement learning is driving some of the latest advances in AI, from DeepMind's AlphaGo to OpenAI's Dota 2 bots. Although these AIs are designed to play games, reinforcement learning is a powerful branch of AI that can be applied to many real-world problems. In this course, we'll cover various RL techniques in order of increasing complexity, applying them to both simulated and real-world problems. Students will develop an intuition for when to use certain RL algorithms, and by the end of the course will have the practical skills necessary to apply RL to a problem they are passionate about and make a positive impact in the world.

Prerequisites

Understand basic Python syntax
Understand how the backpropagation algorithm works

Components

Midterm Project
Final Project
Educational Videos
Quizzes
Reading Assignments
Coding Assignments
Interviews
Group Discussion in Slack

Course Length

10 Weeks
10-15 hours of dedicated study per week
Starts September 10 at 12 PM PST

Tools Used

PyTorch & TensorFlow (Deep Learning Libraries in Python)
OpenAI Gym (Reinforcement Learning Library)
Google Colab (for free GPUs, no need to install/configure dependencies)

Week 1 - Introduction

Topics Covered

Markov Decision Processes, Policy Functions, Value Functions, and the Bellman Equation
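As a quick reference for this week's topics, here is one common way to write the Bellman expectation equation for the state-value function of a policy. The notation is my assumption, not taken from the course material: $\pi(a \mid s)$ is the policy, $P(s' \mid s, a)$ the transition probability, $R(s, a, s')$ the reward, and $\gamma$ the discount factor.

```latex
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} P(s' \mid s, a)
             \left[ R(s, a, s') + \gamma \, V^{\pi}(s') \right]
```

It says that the value of a state is the expected immediate reward plus the discounted value of the next state, averaged over the actions the policy picks and the transitions the environment makes.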

Week 2 - Dynamic Programming

Route Planning
Options Pricing
Scheduling
Operating Systems

Topics Covered

Iterative Policy Evaluation, Policy Improvement, Policy Iteration, Value Iteration
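To make value iteration concrete, here is a minimal sketch on a made-up two-state MDP. The function name `value_iteration`, the data layout (`P[s][a]` as a list of `(probability, next_state)` pairs, `R[s][a]` as the expected reward) and the toy MDP itself are assumptions for illustration, not code from the course.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Return (state values, greedy policy) for a small tabular MDP.

    P[s][a] is a list of (probability, next_state) pairs.
    R[s][a] is the expected immediate reward for action a in state s.
    """
    V = [0.0] * len(P)
    while True:
        # Bellman optimality backup: Q(s, a) = R(s, a) + gamma * E[V(s')]
        Q = [
            [R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
             for a in range(len(P[s]))]
            for s in range(len(P))
        ]
        V_new = [max(q) for q in Q]
        # Stop once the largest change in any state value is below tol
        if max(abs(x - y) for x, y in zip(V_new, V)) < tol:
            policy = [q.index(max(q)) for q in Q]
            return V_new, policy
        V = V_new


# Hypothetical toy MDP: in state 0, action 0 stays put (reward 0) and
# action 1 moves to the absorbing state 1 (reward 1). State 1 gives no reward.
P = [
    [[(1.0, 0)], [(1.0, 1)]],
    [[(1.0, 1)], [(1.0, 1)]],
]
R = [
    [0.0, 1.0],
    [0.0, 0.0],
]

V, policy = value_iteration(P, R)
print(V, policy)
```

In this toy problem the greedy policy in state 0 is to take action 1, since collecting the reward of 1 immediately beats waiting.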

Week 3 - Monte Carlo Methods

Interview #1
Medical Diagnosis
Energy Efficiency
Physics Research

Topics Covered

Monte Carlo prediction, Monte Carlo control, Greedy & Epsilon-Greedy Policies, Exploration vs Exploitation Dilemma
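The exploration vs exploitation trade-off is often handled with an epsilon-greedy policy. Below is a minimal sketch, assuming action values are stored in a plain list; the function name `epsilon_greedy` and the example values are hypothetical.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        # Explore: choose uniformly among all actions
        return rng.randrange(len(q_values))
    # Exploit: choose the action with the highest estimated value
    return max(range(len(q_values)), key=lambda a: q_values[a])


q_values = [0.1, 0.5, 0.2]  # made-up action-value estimates
greedy_action = epsilon_greedy(q_values, epsilon=0.0)   # always exploits
random_action = epsilon_greedy(q_values, epsilon=1.0)   # always explores
print(greedy_action, random_action)
```

With `epsilon=0.0` this reduces to a purely greedy policy, and with `epsilon=1.0` to a uniformly random one.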

Week 4 - Model-Free Learning

Delivery Management
Automated Trading
Backgammon
Dopamine in Neuroscience

Topics Covered

Temporal Difference Learning, SARSA, Q-Learning, Model-Based vs Model-Free Intuition
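As a reminder of what the tabular Q-learning update looks like, here is a minimal sketch. The function name `q_learning_update`, the list-of-lists Q-table and the numbers are assumptions for illustration only.

```python
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99, done=False):
    """Apply one tabular Q-learning update in place and return the table.

    Q[s][a] is the current estimate of the value of action a in state s.
    """
    # Off-policy TD target: bootstrap from the best action in the next state
    target = r if done else r + gamma * max(Q[s_next])
    # Move the estimate a fraction alpha toward the target
    Q[s][a] += alpha * (target - Q[s][a])
    return Q


# Made-up Q-table with 2 states and 2 actions
Q = [[0.0, 0.0], [1.0, 2.0]]
q_learning_update(Q, s=0, a=1, r=1.0, s_next=1, alpha=0.5, gamma=0.9)
print(Q[0][1])
```

SARSA differs only in the target: it bootstraps from the action actually taken in the next state rather than from the maximum over actions.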

Week 5 - Reinforcement Learning in Continuous Spaces

Self Driving Cars
Delivery Drones
Rescue Robots
Assembly Robots

Topics Covered

Control Theory, Imitation Learning, The Hamilton-Jacobi-Bellman Equation, Kalman Filters

Midterm Project

Train a bipedal humanoid robot to walk in simulation!

Week 6 - Deep Reinforcement Learning

Traffic Optimization
Gaming
Meta Learning
Interview #2

Topics Covered

DQN and Double DQN, Dueling DQN, Prioritized Replay, Value-based Methods for Robotics

Week 7 - Policy Based Methods

Web System Configuration
Text Summarization
AI Assisted Design
Portfolio Optimization

Topics Covered

Evolutionary Algorithms, Stochastic Policy Search, Policy Gradients, REINFORCE

Week 8 - Policy Gradient Methods

Dialogue Systems
Photo Editing
Language Translation
Tutoring Systems

Topics Covered

Evolved Policy Gradients, Generalized Advantage Estimation (GAE), Trust Region Policy Optimization, Proximal Policy Optimization (PPO)

Week 9 - Actor Critic Methods

Advanced Trading Techniques
Human-Machine Cooperation
Insurance Cost Analysis
Interview #3

Topics Covered

Actor Critic Algorithms, Asynchronous Advantage Actor Critic, Deep Deterministic Policy Gradients (DDPG), Bayesian Actor-Critic

Week 10 - Multi Agent Reinforcement Learning

Topics Covered

Cooperation, Competition, Parallelism, Inverse Reinforcement Learning

Final Project
