
akhilpadgilwar/BigdataFinalProject


This repository contains all the code for the project.

  1. Data Preprocessing: This folder has two files.
    a) Data Preprocessing for Visualization.ipynb: Contains the preprocessing code required for the visualizations.
    b) Merge&Sample: Contains the code for downsampling and merging the data files (2016, 2017, 2018).

  2. Data Visualization:
    a) Final preliminary-data-visualization.ipynb: Contains the code and outputs for the data visualization.

  3. Database Analysis:
    a) Hive: Contains the analytical queries on the data written in Hive. The corresponding outputs are also uploaded.
    b) Spark: Contains the analytical queries on the data written in Spark. The corresponding outputs are also uploaded.
    c) Pig: Contains the analytical queries on the data written in Pig. The corresponding outputs are also uploaded.

  4. Recommender System:
    a) RecommendationSystemPySpark.py: Contains the data preprocessing specific to the recommender system and the implementation of the recommender system using the ALS recommender, PySpark, and Spark MLlib.
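
The downsampling and merging step listed under Data Preprocessing can be illustrated with a minimal sketch. This is not the code from Merge&Sample; it uses only the Python standard library, and the sampling fraction, the `source_year` column, the example `plate` column, and the file names in the usage note are all assumptions made for illustration.

```python
import csv
import random
from typing import Dict, Iterable, List

Row = Dict[str, str]


def read_csv(path: str) -> List[Row]:
    """Load a CSV file into a list of dictionaries keyed by column name."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def downsample(rows: Iterable[Row], fraction: float, seed: int = 42) -> List[Row]:
    """Keep each row independently with probability `fraction`.

    A fixed seed makes the sample reproducible across runs.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < fraction]


def merge_years(samples_by_year: Dict[int, List[Row]]) -> List[Row]:
    """Concatenate per-year samples in year order.

    Each row is tagged with a hypothetical `source_year` column so the
    origin of a record is still known after merging.
    """
    merged: List[Row] = []
    for year in sorted(samples_by_year):
        for row in samples_by_year[year]:
            merged.append({**row, "source_year": str(year)})
    return merged
```

A possible usage, assuming one CSV file per year with hypothetical names such as `tickets_2016.csv`: call `downsample(read_csv("tickets_2016.csv"), 0.1)` for each year, then pass the resulting dictionary of samples to `merge_years`. Sampling each year before merging keeps the memory footprint small, which is presumably the reason for downsampling in the first place.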

About

Predictive Analytics for New York Parking Tickets

