rpardas/ds-course-projects

Data science and machine learning projects

Resources, exercises, and projects from the 'Applied ML and Data Science with Python' course (March 2020) at Emory University (taught by Sridhar Palle, Ph.D., spalle@emory.edu).

These are primarily Jupyter notebooks using Anaconda.

bank_marketing

(Capstone project)

We will use a marketing/banking dataset obtained from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Bank+Marketing

The dataset is related to phone-call marketing campaigns of a Portuguese banking institution.

The goal is to find the most accurate model for predicting whether a client will subscribe to a term deposit. The target variable, y, is binary (yes/no).

We will use sklearn for:

  • pre-processing
  • splitting the data into train and test sets
  • comparing 4 models (DummyClassifier, LogisticRegression, DecisionTreeClassifier, RandomForestClassifier)
  • comparing metrics (confusion matrix, accuracy, recall, F1, precision, AUC)
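
The workflow listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it uses a small synthetic DataFrame as a stand-in for the bank data, and the column names (`age`, `duration`, `job`), the helper `make_preprocessor`, the hyperparameters, and the split settings are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the bank data (assumption: illustrative columns only).
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "age": rng.integers(18, 90, n),
    "duration": rng.exponential(250, n),
    "job": rng.choice(["admin.", "technician", "retired"], n),
})
signal = df["duration"] / 250 + (df["job"] == "retired") + rng.normal(0, 1, n)
df["y"] = np.where(signal > 2.5, "yes", "no")  # imbalanced yes/no target

X = df.drop(columns="y")
y = (df["y"] == "yes").astype(int)  # encode yes/no as 1/0

# Stratify so both classes appear in the train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)


def make_preprocessor():
    """Hypothetical helper: scale numeric columns, one-hot encode categoricals."""
    return ColumnTransformer([
        ("num", StandardScaler(), ["age", "duration"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["job"]),
    ])


models = {
    "DummyClassifier": DummyClassifier(strategy="most_frequent"),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "DecisionTreeClassifier": DecisionTreeClassifier(max_depth=5, random_state=0),
    "RandomForestClassifier": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    clf = Pipeline([("prep", make_preprocessor()), ("model", model)])
    clf.fit(X_train, y_train)
    pred = clf.predict(X_test)
    proba = clf.predict_proba(X_test)[:, 1]  # probability of the "yes" class
    results[name] = {
        "confusion_matrix": confusion_matrix(y_test, pred),
        "accuracy": accuracy_score(y_test, pred),
        "recall": recall_score(y_test, pred, zero_division=0),
        "precision": precision_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
        "auc": roc_auc_score(y_test, proba),
    }

# Print the scalar metrics side by side for comparison.
summary = pd.DataFrame(results).T.drop(columns="confusion_matrix")
print(summary)
```

Because the target is imbalanced, the DummyClassifier baseline can reach a high accuracy while its recall is zero, which is one reason to compare recall, precision, F1, and AUC alongside accuracy.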

numpy_pandas_intro

  • numpy.ipynb

    A walkthrough of common numpy features

  • pandas.ipynb

    A walkthrough of common pandas features

  • numpy_pandas_in_practice.ipynb

    A few samples of what numpy and pandas can do

supervised_ml

  • sml-classification-exercise.ipynb

    Classification of a diabetes dataset

  • sml-classification.ipynb

    Explore a breast cancer dataset with sklearn and supervised ML

  • sml-classification.ipynb

    Explore the sklearn diabetes dataset with regression
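
For orientation, the two sklearn built-in datasets mentioned above can be loaded and modeled along these lines. This is only a sketch: the choice of LogisticRegression and LinearRegression, the scaling step, and the split settings are assumptions, and the notebooks may use different estimators.

```python
from sklearn.datasets import load_breast_cancer, load_diabetes
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import accuracy_score, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Classification: breast cancer dataset (569 samples, 30 features, binary target).
X_c, y_c = load_breast_cancer(return_X_y=True)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(
    X_c, y_c, stratify=y_c, random_state=0)
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(Xc_train, yc_train)
acc = accuracy_score(yc_test, clf.predict(Xc_test))
print(f"breast cancer accuracy: {acc:.3f}")

# Regression: diabetes dataset (442 samples, 10 features, continuous target).
X_r, y_r = load_diabetes(return_X_y=True)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(
    X_r, y_r, random_state=0)
reg = LinearRegression().fit(Xr_train, yr_train)
r2 = r2_score(yr_test, reg.predict(Xr_test))
print(f"diabetes R^2: {r2:.3f}")
```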
