TaivasZhang/Song.Recommender

Welcome to the Song Recommender

This repository contains a project that uses the Million Song Dataset to build a song recommendation engine. Unlike other recommendation systems, which rely on either content-based or collaborative filtering techniques, this recommender focuses on the intrinsic features embedded in the beats, rhythms, and vocals of a song. Such features can be extracted using mel-frequency cepstral coefficients (MFCCs).
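To make the idea concrete, here is a minimal sketch of ranking songs by the similarity of their MFCC-style feature vectors. It is an illustration only, not the code in this repository: averaging frames into one vector per song, using cosine similarity, the function names, and the toy data are all assumptions.

```python
import math


def mean_vector(frames):
    """Average per-frame coefficient vectors into one vector per song."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(query_id, song_vectors, top_n=3):
    """Return the ids of the top_n songs most similar to query_id."""
    query = song_vectors[query_id]
    scored = [
        (cosine_similarity(query, vector), song_id)
        for song_id, vector in song_vectors.items()
        if song_id != query_id
    ]
    scored.sort(reverse=True)
    return [song_id for _, song_id in scored[:top_n]]


# Hypothetical toy data: two frames of three coefficients per song.
toy_frames = {
    "song_a": [[1.0, 0.0, 0.5], [0.8, 0.2, 0.5]],
    "song_b": [[0.9, 0.1, 0.4], [1.0, 0.0, 0.6]],
    "song_c": [[-1.0, 2.0, 0.0], [-0.8, 1.8, 0.2]],
}
song_vectors = {song_id: mean_vector(frames) for song_id, frames in toy_frames.items()}
```

With this toy data, recommend("song_a", song_vectors, top_n=1) picks song_b, whose averaged coefficients point in nearly the same direction as those of song_a.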

The repository contains the following files:

The Gather & JSON .py Files

These two .py files require a Python 2.7 environment. The dataset is stored on AWS in HDF5 format. After the data is mounted to my Google Cloud bucket, it has to be transformed into a different format: hdf5_gathers.py extracts the features, and hdf5_json_all.py transforms them into JSON. To run the scripts, type the following command in your terminal:

python hdf5_json_all.py
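As a rough picture of the JSON step, the sketch below writes already-extracted features as one JSON object per line. It does not reproduce hdf5_json_all.py: reading the HDF5 files is left out, and the function names and the "track_id" and "mfcc_mean" fields are hypothetical. It uses only the standard library and avoids syntax that Python 2.7 lacks.

```python
import json


def features_to_json_line(track_id, features):
    """Serialize one track's features (name -> list of numbers) as a JSON string."""
    record = {"track_id": track_id}
    for name, values in features.items():
        # Cast to plain floats so numeric types from other libraries serialize cleanly.
        record[name] = [float(value) for value in values]
    return json.dumps(record, sort_keys=True)


def write_json_lines(records, path):
    """Write (track_id, features) pairs to path, one JSON object per line."""
    with open(path, "w") as out:
        for track_id, features in records:
            out.write(features_to_json_line(track_id, features) + "\n")


# Hypothetical example record.
line = features_to_json_line("TR0001", {"mfcc_mean": [1, 2.5]})
```

One object per line is a common choice for large datasets because each line can be parsed independently by downstream tools.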

The EDA Notebook

In this notebook, we created visualizations of the dataset. The notebook file is large because it includes the rendered plots, so please download it and open it on your local machine.

The Recommender & the Result GUI Notebooks

The Recommender notebook contains the main part of our project: the data modeling. We used PySpark to handle the large volume of data. The Result GUI notebook shows a GUI demo of the recommender using a sample of the results.

Learn more about this Project.

Copyright © 2020 by Skye Zhang. All rights reserved.
