A system for tagging songs based on their chord progressions.
HarmoniTag is a project and pipeline for tagging songs: it uses machine learning to infer genres from the chord progressions that appear in each song. Its first iteration was built as part of a B.Sc. computer science workshop. It is currently being reworked to refine the methods used, with a view to scientific publication. The data comes from the Last.fm dataset of the Million Song Dataset and from various chord websites (such as chordie.com and ultimate-guitar.com).
The GitHub repository mostly contains Python code for:
- A Django ORM database structure for storing song/tag/chord-progression data (database itself not included in the repository).
- Populating the database with song-tag data from the Last.fm dataset.
- Populating the database with song-chord data by querying and parsing relevant websites.
- Calculating chord progressions for the songs in the database and performing feature selection on them by mutual information against tags.
- Training and testing various machine learning models on the data and extracting statistics.
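
The feature selection step can be illustrated with a minimal sketch. It is not the repository's code: it assumes a chord progression is a run of `n` consecutive chords (an n-gram), treats each progression as a binary "appears in this song" feature, and scores it by mutual information against a binary "song has this tag" label. The function names `progressions`, `mutual_information`, and `rank_progressions` are hypothetical, and the real pipeline may represent chords differently (for example, relative to the song's key).

```python
import math
from collections import Counter


def progressions(chords, n=3):
    """Return the set of length-n chord n-grams in one song (assumed representation)."""
    return {tuple(chords[i:i + n]) for i in range(len(chords) - n + 1)}


def mutual_information(feature, label):
    """Mutual information, in bits, between two equal-length discrete sequences."""
    total = len(feature)
    joint = Counter(zip(feature, label))
    feature_counts = Counter(feature)
    label_counts = Counter(label)
    mi = 0.0
    for (f, l), count in joint.items():
        p_joint = count / total
        # p(f, l) / (p(f) * p(l)), written in terms of raw counts
        ratio = p_joint * total * total / (feature_counts[f] * label_counts[l])
        mi += p_joint * math.log2(ratio)
    return mi


def rank_progressions(songs, tag, n=3):
    """Rank progressions by mutual information with `tag`, highest first.

    `songs` is a list of (chord_list, tag_set) pairs.
    """
    per_song = [progressions(chords, n) for chords, _ in songs]
    labels = [tag in tags for _, tags in songs]
    vocabulary = set().union(*per_song)
    scores = {
        prog: mutual_information([prog in found for found in per_song], labels)
        for prog in vocabulary
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data for illustration only; not taken from the Last.fm dataset.
songs = [
    (["C", "G", "Am", "F"], {"pop"}),
    (["C", "G", "Am", "F", "C"], {"pop"}),
    (["E", "A", "B", "E"], {"blues"}),
    (["E", "A", "B", "A"], {"blues"}),
]
ranked = rank_progressions(songs, "pop")
```

Keeping only the top-ranked progressions per tag gives a smaller feature set to train models on. With a real dataset, a library routine such as scikit-learn's `mutual_info_classif` could replace the hand-written scoring function.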