This project contains code from Berkeley's CS190.1x course on Apache Spark.
Used Apache Spark to learn the statistical and algorithmic principles behind scalable, real-world machine learning pipelines: exploratory data analysis, feature extraction, supervised learning, and model evaluation. Learned to implement scalable algorithms for fundamental statistical models.