
Ericbaba/MIT_dataiap

 
 


MIT_dataiap

MIT workshop on "How to Process, Analyze and Visualize Data"

Day 0: Class organization and programming environment setup.

Day 1: An end-to-end example getting you from a dataset found online to several plots of campaign contributions.

Day 2: Lots of visualization examples, and practice going from data to chart.

Day 3: Statistics basics, including t-tests, linear regression, and statistical significance. We'll use campaign finance data and per-county health rankings.

Day 4: Text processing on a large text corpus (the Enron email dataset) using tf-idf and cosine similarity.

Day 5: Scaling up to process large datasets using Hadoop/MapReduce on a larger copy of the Enron dataset.
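
To give a flavor of the Day 4 material, here is a minimal sketch of tf-idf weighting and cosine similarity in plain Python. It is not the workshop's own code: the function names (`tokenize`, `tf_idf_vectors`, `cosine_similarity`), the whitespace tokenizer, the particular tf-idf variant (term frequency times `log(N / df)`), and the three tiny sample documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def tf_idf_vectors(docs):
    """Return one sparse {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Assumed variant: relative term frequency times log(N / df).
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up sample documents, not taken from the Enron dataset.
docs = [
    "meeting about energy trading tomorrow",
    "energy trading meeting moved to friday",
    "lunch plans for friday",
]
vectors = tf_idf_vectors(docs)
sim_01 = cosine_similarity(vectors[0], vectors[1])  # share several terms
sim_02 = cosine_similarity(vectors[0], vectors[2])  # share no terms
```

In this sketch the first two documents share "meeting", "energy", and "trading", so their similarity is positive, while the first and third share no terms and score zero. A real corpus would also call for better tokenization and stop-word handling.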

