chenyz0601/mmd-project

This repository contains the projects for Mining Massive Data.
Authors: Yuanze Chen, Alex
Time: 25/10/2016

Data

A subset of the Million Song Dataset: 10,000 songs (1.8 GB compressed).

Project 1: duplicate detection

Uses Locality-Sensitive Hashing (LSH) with cosine distance.
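
As an illustration of how LSH can pair with cosine distance, here is a minimal sketch using random-hyperplane signatures. It is not the project's actual code: the function names, the exact-match bucketing scheme, and the `threshold` and `n_bits` defaults are all assumptions made for the example, and songs are assumed to be rows of a dense feature matrix.

```python
import numpy as np

def lsh_signatures(X, n_bits=16, seed=0):
    """Random-hyperplane signatures: one sign bit per hyperplane."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((X.shape[1], n_bits))
    return (X @ planes) >= 0  # boolean array, shape (n_items, n_bits)

def candidate_pairs(signatures):
    """Pair up items whose signatures match exactly (same bucket)."""
    buckets = {}
    for idx, sig in enumerate(signatures):
        buckets.setdefault(sig.tobytes(), []).append(idx)
    pairs = set()
    for members in buckets.values():
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                pairs.add((members[i], members[j]))
    return pairs

def cosine_distance(a, b):
    return 1.0 - float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def find_duplicates(X, threshold=0.05, n_bits=16, seed=0):
    """Verify LSH candidates with the exact cosine distance."""
    sigs = lsh_signatures(X, n_bits, seed)
    return sorted(
        (i, j)
        for i, j in candidate_pairs(sigs)
        if cosine_distance(X[i], X[j]) <= threshold
    )
```

In practice the signature is usually split into several bands so that near-duplicates only need to agree on one band; the single-bucket version above keeps the sketch short at the cost of recall.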

Project 2: song recommendation (part 1)

Uses latent factor models.
Uses alternating optimization to find the latent factors of the user-song-count matrix.
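
One common way to alternate between the two factor matrices is alternating least squares (ALS): fix the song factors and solve a small ridge regression per user, then swap roles. The sketch below assumes that a count of zero means "unobserved" and that an L2 penalty `reg` is used; both are assumptions for illustration, as are the function names and defaults.

```python
import numpy as np

def als(R, rank=2, reg=0.1, n_iters=20, seed=0):
    """Factor R ~ P @ Q.T over the observed (non-zero) entries only."""
    rng = np.random.default_rng(seed)
    n_users, n_songs = R.shape
    mask = R > 0  # assumption: zero count means "not observed"
    P = 0.1 * rng.standard_normal((n_users, rank))
    Q = 0.1 * rng.standard_normal((n_songs, rank))
    ridge = reg * np.eye(rank)
    for _ in range(n_iters):
        # Fix Q, solve a regularized least-squares problem per user.
        for u in range(n_users):
            Qu = Q[mask[u]]
            P[u] = np.linalg.solve(Qu.T @ Qu + ridge, Qu.T @ R[u, mask[u]])
        # Fix P, solve the same kind of problem per song.
        for s in range(n_songs):
            Ps = P[mask[:, s]]
            Q[s] = np.linalg.solve(Ps.T @ Ps + ridge, Ps.T @ R[mask[:, s], s])
    return P, Q

def observed_rmse(R, P, Q):
    """Root-mean-square error on the observed entries."""
    mask = R > 0
    err = (R - P @ Q.T)[mask]
    return float(np.sqrt(np.mean(err ** 2)))
```

Each inner solve is only a `rank` by `rank` linear system, which is what makes the alternating scheme attractive for large, sparse count matrices.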

Project 3: song recommendation (part 2)

Uses gradient descent, SGD, and mini-batch SGD to solve the latent factor problem.
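
The three optimizers differ only in how many observed entries contribute to each update, so they can be sketched with a single `batch_size` parameter: 1 gives SGD, the number of observed entries gives full gradient descent, and anything in between gives mini-batch SGD. This is a hedged illustration, not the project's code; the squared-error loss with L2 penalty, the learning rate, and all names are assumptions.

```python
import numpy as np

def factorize_sgd(R, rank=2, lr=0.01, reg=0.05, n_epochs=200,
                  batch_size=1, seed=0):
    """Fit R ~ P @ Q.T on non-zero entries with (mini-batch) gradient steps."""
    rng = np.random.default_rng(seed)
    users, songs = np.nonzero(R)  # assumption: zero means "not observed"
    values = R[users, songs]
    P = 0.1 * rng.standard_normal((R.shape[0], rank))
    Q = 0.1 * rng.standard_normal((R.shape[1], rank))
    n_obs = len(values)
    for _ in range(n_epochs):
        order = rng.permutation(n_obs)
        for start in range(0, n_obs, batch_size):
            batch = order[start:start + batch_size]
            u, s, r = users[batch], songs[batch], values[batch]
            err = r - np.sum(P[u] * Q[s], axis=1)
            grad_P = -err[:, None] * Q[s] + reg * P[u]
            grad_Q = -err[:, None] * P[u] + reg * Q[s]
            # subtract.at accumulates correctly when a user or song
            # appears more than once in the same batch.
            np.subtract.at(P, u, lr * grad_P)
            np.subtract.at(Q, s, lr * grad_Q)
    return P, Q

def observed_rmse(R, P, Q):
    mask = R != 0
    err = (R - P @ Q.T)[mask]
    return float(np.sqrt(np.mean(err ** 2)))
```

Gradients are summed over the batch rather than averaged, so larger batches take larger effective steps; a real run would likely tune `lr` per batch size.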

Project 4: song ranking

Computes song similarity and builds a similarity network of songs.
Uses Topic-Specific PageRank to rank songs.
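
Topic-Specific PageRank differs from ordinary PageRank only in where the random surfer teleports: to a chosen set of "topic" nodes instead of uniformly to all nodes. A minimal power-iteration sketch follows, assuming the similarity network is given as a dense weight matrix `W`; the damping factor `beta`, the handling of nodes without outgoing weight, and the function name are assumptions for the example.

```python
import numpy as np

def topic_specific_pagerank(W, topic_nodes, beta=0.85, tol=1e-10,
                            max_iter=1000):
    """Power iteration with teleportation restricted to topic_nodes."""
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    teleport = np.zeros(n)
    teleport[list(topic_nodes)] = 1.0 / len(topic_nodes)
    out = W.sum(axis=1)
    # Row-stochastic transitions; nodes with no outgoing weight
    # jump straight to the teleport set (an assumption of this sketch).
    safe_out = np.where(out > 0, out, 1.0)
    M = np.where(out[:, None] > 0, W / safe_out[:, None], teleport[None, :])
    r = teleport.copy()
    for _ in range(max_iter):
        r_next = beta * (M.T @ r) + (1.0 - beta) * teleport
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r
```

Songs close to the topic set in the similarity network receive more rank than distant ones, so changing `topic_nodes` changes the ranking.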

Project 5: song clustering

Uses the network from Project 4 to construct a weighted adjacency matrix.
Performs spectral clustering on it; both the normalized and unnormalized graph Laplacians are supported.
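
The two Laplacian variants can be sketched as follows. With weighted adjacency matrix W and degree matrix D, the unnormalized Laplacian is L = D - W and the symmetric normalized one is D^(-1/2) L D^(-1/2). The code below is an illustrative sketch, not the project's implementation: the choice of the symmetric normalization, the row-normalization of the embedding, and the small deterministic k-means helper are all assumptions.

```python
import numpy as np

def graph_laplacian(W, normalized=True):
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)
    L = np.diag(d) - W  # unnormalized Laplacian
    if normalized:
        d_inv_sqrt = 1.0 / np.sqrt(np.where(d > 0, d, 1.0))
        L = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    return L

def kmeans(X, k, n_iters=100):
    """Tiny k-means with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers)
    labels = None
    for _ in range(n_iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def spectral_clustering(W, k, normalized=True):
    L = graph_laplacian(W, normalized)
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    emb = vecs[:, :k]            # eigenvectors of the k smallest eigenvalues
    if normalized:
        norms = np.linalg.norm(emb, axis=1, keepdims=True)
        emb = emb / np.where(norms > 0, norms, 1.0)
    return kmeans(emb, k)
```

For example, `spectral_clustering(W, k=2, normalized=False)` clusters with the unnormalized Laplacian, while the default uses the normalized one.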