Code for the Million Song Dataset, which contains metadata and audio analysis for a million tracks and is a collaboration between The Echo Nest and LabROSA. See the website for details.

glinmac/MSongsDB

 
 

MILLION SONG DATASET

http://labrosa.ee.columbia.edu/millionsong/

January 2011


  • The dataset contains the analysis and metadata for a million songs. The goal is to provide a large dataset for researchers to report results on, hence encouraging algorithms that scale to commercial sizes.

  • Most of the information is provided by The Echo Nest. The dataset is the result of a collaboration between The Echo Nest and LabROSA at Columbia University. This project is funded in part by the NSF.

  • Most of the data is licensed under the same terms as The Echo Nest's API.

    For the SecondHandSongs dataset (cover songs), see the webpage:

    http://labrosa.ee.columbia.edu/millionsong/secondhand

    For the musiXmatch dataset (lyrics), see the webpage:

    http://labrosa.ee.columbia.edu/millionsong/musixmatch

    The code is released under the GNU General Public License (GPL). See LICENSE for details.

  • Most details and instructions on how to get the dataset can be found on the project's website:
    http://labrosa.ee.columbia.edu/millionsong/


If you have any questions or comments:

https://groups.google.com/forum/#!forum/millionsongdataset


