GitHub - jayurbain/machine-learning: Machine learning course materials

Machine Learning

This course provides an introduction to machine learning. Machine learning explores the study and construction of algorithms that can learn from and make predictions on data. Such algorithms operate by building a model from example inputs in order to make data-driven predictions or decisions, rather than following strictly static program instructions.

Topic categories include supervised, unsupervised, and reinforcement learning. Students will learn how to apply machine learning methods to solve problems in computer vision, natural language processing, classification, and prediction. Fundamental and current state-of-the-art methods including boosting and deep learning will be covered. Students will reinforce their learning of machine learning algorithms with hands-on tutorial oriented laboratory exercises using Jupyter Notebooks.

Prerequisites: MA-262 Probability and Statistics; programming maturity, and the ability to program in Python.

Helpful: CS3851 Algorithms, MA-383 Linear Algebra, Data Science.

ABET: Math/Science, Engineering Topics.

2-2-3 (class hours/week, laboratory hours/week, credits)

Lectures are augmented with hands-on tutorials using Jupyter Notebooks. Laboratory assignments will be completed using Python and related data science packages: NumPy, Pandas, ScipPy, StatsModels, Scikit-learn, Matplotlib, TensorFlow, Keras, PyTorch.

Outcomes:

Understand the basic process of machine learning.
Understand the concepts of learning theory, i.e., what is learnable, bias, variance, overfitting.
Understand the concepts and application of supervised, unsupervised, semi-supervised, and reinforcement learning.
The ability to analyze a data set including the ability to understand which data attributes (dimensions) affect the outcome.
Understand the application of learned models to problems in classification, prediction, clustering, computer vision, and NLP.
Understand deep learning concepts and architectures including representation learning Multi-layer Perceptrons, Convolutional Neural Networks, Recurrent Neural Networks, and Attention Mechanisms.
The ability to assess the quality of predictions and inferences.
The ability to apply methods to real world data sets.

References:

*Hands-On Machine Learning with Scikit-Learn and TensorFlow Concepts, Tools, and Techniques to Build Intelligent Systems (MLSLT), Aurélien Géron. O'Reilly Media, 2017

Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition, by Aurélien Géron, Publisher: O'Reilly Media, Inc. Release Date: June 2019 ISBN: 9781492032649

*Deep Learning with Python (DLP), François Chollet. Manning, 2017.

Deep Learning (DL), Ian Goodfellow, Yoshua Bengio, and Aaron Courville. MIT Press, 2016.

An Introduction to Statistical Learning: with Applications in R (ISLR), Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani. 2015 Edition, Springer.

Python Data Science Handbook (PDSH), Jake VanderPlas, O'Reilly.

Mining of Massive Datasets (MMDS). Anand Rajaraman and Jeffrey David Ullman. http://www.mmds.org/

Week 1: Intro to Machine Learning

Lecture:

Demonstrations
Reading: MLSLT Ch. 1

Introduction to Git and GitHub

Reference: git - the simple guide

Machine Learning Foundations

End to end machine learning
Image Classification Using Deep Learning
Back Pain
Vehicle Detection
Reading: MLSLT Ch. 2

Lab Notebooks:

Using Jupyter Notebooks
Python Programming for Data Science Submission required
Python Numpy Submission required

Optional tutorial notebooks:

Outcomes addressed in week 1:

Understand the basic process of machine learning:
Understand the concepts and application of supervised, unsupervised, semi-supervised, and reinforcement learning.

Week 2: Linear Regression, Multivariate Regression

Lecture:

Linear Regression 1

Reading: PDSH Ch. 5 p. 331-375, 390-399
Reading: ISLR Ch. 1, 2

Linear Regression Notebook Use for second lecture

Normal Equation Derivation
Reading: ISLR Ch. 3
Reading: PDSH Ch. 5 p. 359-375

Generalized Linear Models Notebook Optional

Lab Notebooks:

Introduction to Machine Learning with Scikit Learn
Supervised Learning Linear Regression Submission required
Bike Demand Linear Regression Optional extra credit submission

Outcomes addressed in week 4:

The ability to analyze a data set including the ability to understand which data attributes (dimensions) affect the outcome.
The ability to perform basic data analysis and statistical inference.
The ability to perform supervised learning of prediction models.
The ability to perform data visualization and report generation.
The ability to apply methods to real world data sets.

Week 3: Introduction to Classification, KNN, Model Evaluation and Metrics. Logistic Regression

Lecture:

Linear Model Selection and Regularization

Reading: ISLR Ch. 6

Introduction to Machine Learning with KNN

Reading: ISLR Ch. 4.6.5

Logistic Regression Classification

Reading: ISLR Ch. 4

Lab Notebooks:

Gradient Descent Submission required
Linear Regression - Recursive Feature Elimination - Bike sharing data set Submission required

Outcomes addressed in week 4:

The ability to assess the quality of predictions and inferences.
The ability to apply methods to real world data sets.
The ability to perform supervised learning of prediction models.

Week 4: Logistic Regression, Model Selection and Regularization, ROC

Lecture:

Logistic Regression Classification

Reading: ISLR Ch. 4

Model Evaluation and Metrics, ROC

Scikit-learn ROC Curve notebook
Reading: PDSH Ch. 5 p. 331-375, 390-399
Reading: ISLR Ch. 5

Regularization and overfitting

Lab Notebooks:

Supervised Learning - Logistic Regression Submission required
Multinomial Image Classification submission required
If you're having trouble reading MNIST from mldata use the following notebook to load the data:
Write Read MNIST
mnist_data.csv
mnist_target.csv

Outcomes:

Understand the concepts and application of supervised, unsupervised, semi-supervised, and reinforcement learning.
The ability to analyze a data set including the ability to understand which data attributes (dimensions) affect the outcome.
Understand the application of learned models to problems in classification, prediction, clustering, computer vision, and NLP.
The ability to assess the quality of predictions and inferences.
The ability to apply methods to real world data sets.

Week 5: Decision Trees, Bagging, Random Forests

Lecture:

Decision Trees

Reading: PDSH Ch. 5 p. 421-432
Reading: ISLR Ch. 8.1
Information Gain Calculation Spreadsheet

Bagging, Random Forests, Boosting

Reading: PDSH Ch. 5 p. 421-432
Reading: ISLR Ch. 8.2

Lab Notebooks:

Decision Trees submission required

Outcomes:

Understand the concepts and application of supervised, unsupervised, semi-supervised, and reinforcement learning.
The ability to analyze a data set including the ability to understand which data attributes (dimensions) affect the outcome.
Understand the application of learned models to problems in classification, prediction, clustering, computer vision, and NLP.
The ability to assess the quality of predictions and inferences.
The ability to apply methods to real world data sets.

Week 6: Boosting, XGBoost, Midterm

Lecture:

Lab Notebooks:

Random Forests recommended for study, submission not required
Random Forests and Gradient Boosting recommended for study, submission not required
Ensembling optional, not required

Outcomes:

Understand the concepts and application of supervised, unsupervised, semi-supervised, and reinforcement learning.
The ability to analyze a data set including the ability to understand which data attributes (dimensions) affect the outcome.
Understand the application of learned models to problems in classification, prediction, clustering, computer vision, and NLP.
The ability to assess the quality of predictions and inferences.
The ability to apply methods to real world data sets.

Week 7: Introduction to Deep Learning and Backpropagation

Lecture:

Deep Learning Introduction 1
Reference for earlier in class:

Lab Notebooks:

NeuralNetworkIntro submission required
Additional Google Colab notebooks:
Install TensorFlow:
conda install -c conda-forge tensorflow

Complete the following TensorFlow "Learn and use ML" tutorials. Submit screen shot demonstrating complete of the tutorial along with feedback to Blackboard.

Get Started with TensorFlow

https://www.tensorflow.org/tutorials/

Train your first neural network: basic classification

https://www.tensorflow.org/tutorials/keras/basic_classification

Explore overfitting and underfitting

https://www.tensorflow.org/tutorials/keras/overfit_and_underfit

Optional:

MNIST with CNN Optional
Work on the other "Learn and use ML" tutorials. Optional
Introduction to TensorFlow Optional
Neural Network Fundamentals Optional

Reading:

DLP Chs. 1-4

Outcomes addressed in week 8:

Understand the concepts of learning theory, i.e., what is learnable, bias, variance, overfitting.
Understand the concepts and application of supervised, unsupervised, semi-supervised, and reinforcement learning.
Understand the application of learned models to problems in classification, prediction, clustering, computer vision, and NLP.
Understand deep learning concepts and architectures including representation learning Multi-layer Perceptrons, Convolutional Neural Networks, Recurrent Neural Networks, and Attention Mechanisms.

Week 8: Deep Learning for Computer Vision

Lecture:

Reading:

DLP Ch. 5

Lab Notebooks:

Setting up your environment:

First, do a git pull on the course repository.

Note: You may run the lab notebooks in Google Collab. You will need a Google account and will need to store datasets in your own Google Drive folder. https://colab.research.google.com/notebooks/welcome.ipynb

Use the follow procedure when opening one of our course notebooks in Collab:

Select: NEW PYTHON 3 NOTEBOOK
Once the notebook opens, select EDIT, NOTEBOOK SETTINGS, and select GPU. You can also select Python 3 here as well.
Select FILE, OPEN, UPLOAD and upload your notebook.
For the notebooks that use the cat and dog images, you will need to upload those files to a folder on your Google Drive.

If you want to run tensorflow and keras locally, use the following installation procedure:

Tensorflow Keras Jupyter Install Cheat Sheet

Notebooks:

Keras Neural Network Intro Review only
Convnets with Small Datasets Submission required
Image Classification Extra Credit

Optional material:

Outcomes addressed in week 9:

Understand the concepts of learning theory, i.e., what is learnable, bias, variance, overfitting.
Understand the application of learned models to problems in classification, prediction, clustering, computer vision, and NLP.
Understand deep learning concepts and architectures including representation learning Multi-layer Perceptrons, Convolutional Neural Networks, Recurrent Neural Networks, and Attention Mechanisms.

Week 9: Deep Learning for NLP

Lecture:

NLP Classification
Optional Convnets for Structured Prediction

Lab Notebooks:

NLP Classification Submission required
NLP Translation Extra Credit

Outcomes addressed in week 9:

Understand the concepts of learning theory, i.e., what is learnable, bias, variance, overfitting.
Understand the concepts and application of supervised, unsupervised, semi-supervised, and reinforcement learning.
Understand the application of learned models to problems in classification, prediction, clustering, computer vision, and NLP.
Understand deep learning concepts and architectures including representation learning Multi-layer Perceptrons, Convolutional Neural Networks, Recurrent Neural Networks, and Attention Mechanisms.

Week 10: Generative Deep Learning

Lecture:

NLP Translation
Deep Learning Trends NLP [DeepLearningTrendsNLP2019]
Final Review Study Guide

Lab Notebooks:

Complete assignments

Outcomes addressed in week 10:

Understand the concepts of learning theory, i.e., what is learnable, bias, variance, overfitting.
Understand the concepts and application of supervised, unsupervised, semi-supervised, and reinforcement learning.
Understand the application of learned models to problems in classification, prediction, clustering, computer vision, and NLP.
Understand deep learning concepts including representation learning.

Final Exam: Monday, 8-10AM, S243.

Thomas More Catholic Church, 215 Thomas More Drive, Elgin

where she was a parishioner. Visitation will be held on Friday morning at the church from 9:30 AM until the time of mass. In lieu of flowers, the family requests donations be made to Ronald McDonald House.

Name		Name	Last commit message	Last commit date
Latest commit History 148 Commits
.idea		.idea
code		code
data		data
handouts		handouts
keras		keras
notebooks		notebooks
pgmpy_notebook		pgmpy_notebook
slides		slides
README.md		README.md
syllabus.docx		syllabus.docx
syllabus.pdf		syllabus.pdf
tensorflow_keras_jupyter_install_cheat_sheet.txt		tensorflow_keras_jupyter_install_cheat_sheet.txt

jayurbain/machine-learning

Folders and files

Latest commit

History

Repository files navigation

Machine Learning

Week 1: Intro to Machine Learning

Lecture:

Lab Notebooks:

Optional tutorial notebooks:

Week 2: Linear Regression, Multivariate Regression

Lecture:

Lab Notebooks:

Week 3: Introduction to Classification, KNN, Model Evaluation and Metrics. Logistic Regression

Lecture:

Lab Notebooks:

Week 4: Logistic Regression, Model Selection and Regularization, ROC

Lecture:

Lab Notebooks:

Week 5: Decision Trees, Bagging, Random Forests

Lecture:

Lab Notebooks:

Week 6: Boosting, XGBoost, Midterm

Lecture:

Lab Notebooks:

Week 7: Introduction to Deep Learning and Backpropagation

Lecture:

Lab Notebooks:

Week 8: Deep Learning for Computer Vision

Lecture:

Lab Notebooks:

Week 9: Deep Learning for NLP

Lecture:

Lab Notebooks:

Week 10: Generative Deep Learning

Lecture:

Lab Notebooks:

About

Resources

Stars

Watchers

Forks

Languages