badpaper/coursework

Coursework for MIDS Scaling Up! Really Big Data

This is an index of coursework for the MIDS class "Scaling Up! Really Big Data". If you find problems in the assignments, please submit corrections as well-formed Git pull requests.

Week 2: Cloud Computing 101

Labs

  1. Salt States and Docker deployment of the ELK stack

Week 3: OpenStack Introduction

Labs

  1. Hadoop over OpenStack DevStack using Sahara

Week 4: Distributed Filesystems

Homework

This is a graded homework assignment.

  1. Part 1: GPFS setup
  2. Part 2: The Mumbler

Labs

There will be no in-class lab for this assignment.

Week 5: Distributed Filesystems

Homework

  1. Part 1: Hadoop v1 Setup
  2. Part 2: Hadoop v2 Setup

Labs

Complete the following in order:

  1. Load Google 2-gram dataset into HDFS
  2. Preprocess 2-gram data for Mumbler

Week 6: Apache Spark

Homework

  1. Apache Spark Introduction

Labs

  1. Machine Learning with Spark and MLlib

Week 7: Object Storage

Homework

  1. Object Storage

