ReChord

A Tool for Chord Sequence Detection

Table of Contents

  • Motivation
  • Requirements
  • Usage
  • Example
  • Approach
  • Product Design
  • Tools
  • Acknowledgements

Motivation

Chord transcription is the skill of detecting the chord progression of a musical composition by ear. Acquiring this skill is time-consuming and quite daunting for most musicians, which is why the majority use sheet music to learn songs.

Even though it is easy to find accurate sheet music or tabs for the classics, that is not the case for newly released or more obscure music.

Therefore, I built ReChord - a web application that transcribes chords for you in just a few minutes!

I utilized my skills in Software Development, Deep Learning, Signal Processing, and Music Theory to create the application, and I hope it will prove useful for fellow musicians and music enthusiasts.

Learn more: Slides Demo

Requirements

You will need Python 3.7 or higher.

Please install the dependencies:

  • Clone the repo with git clone https://github.com/belovm96/chord-detection
  • From the repo's root directory, run pip3 install -r requirements.txt

If you would like to use the ReChord App, you will need Docker, Streamlit, and FFmpeg. Otherwise, just install FFmpeg.

Usage

Both the ReChord App and the ReChord Command Line Tool require a GPU on your machine!

Web App

  • Clone this repository
  • From the repo's root directory, cd ChordDetection/app
  • Pull the base Docker image: docker pull tensorflow/tensorflow:latest-gpu
  • Build the app's Docker image: docker image build -t streamlit:app .
  • Run the app: docker container run --gpus all -p 8501:8501 streamlit:app

Command Line

  • Clone this repository
  • Put the song that you would like to transcribe in the data folder of the repo's root directory
  • To get chord transcriptions, run python transcribe.py --song ./data/song name
    • If the song's file name contains spaces, enclose it in quotes, e.g. python transcribe.py --song ./data/'U2 - With Or Without You - Remastered.mp3'
    • The script will ask you to provide the time interval of the song that you would like annotated
  • Chord-Time representations will be saved to the annotations folder in PNG format

Example

Example usage can be found in the notebooks. You can also follow the steps below to get an idea of how ReChord can be used as a script.

  • Run python transcribe.py --song ./data/'U2 - With Or Without You - Remastered.mp3'
    Inference takes 1-3 minutes, depending on your GPU capabilities. During inference, the file will be converted to WAV format and stored in the same directory as your input song.
  • Enter the time interval: 10:20
  • The Chord-Time representation will be saved to annotations
  • The Chord-Time representation can be read as follows:
    • y-axis - chords to play
    • x-axis - time in seconds
    • each purple square indicates which chord to play at each 0.1-second time step. In the example below, no chord is played from 10 to 12.5 seconds, and the D chord is played from 12.5 to 20 seconds of the song.
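
To make the layout concrete, here is a minimal plotting sketch (not the app's actual code) that renders a Chord-Time representation like the one described above. The chord list, the interval, and the activation matrix are hypothetical stand-ins chosen to match the example:

```python
# Sketch only: render a Chord-Time heatmap from a toy activation matrix.
# One row per chord label, one column per 0.1 s time step.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

chords = ['N', 'C', 'D', 'E', 'F', 'G', 'A', 'B']  # 'N' = no chord (assumed labels)
start, end = 10.0, 20.0                            # annotated interval in seconds
times = np.arange(start, end, 0.1)                 # 0.1 s time steps

# Toy activations matching the example: 'N' from 10-12.5 s, 'D' from 12.5-20 s.
activations = np.zeros((len(chords), len(times)))
activations[chords.index('N'), times < 12.5] = 1
activations[chords.index('D'), times >= 12.5] = 1

ax = sns.heatmap(activations, cmap='Purples', cbar=False,
                 xticklabels=10, yticklabels=chords)
ax.set_xticklabels([f'{t:.0f}' for t in times[::10]])  # label every 10th step
ax.set_xlabel('time (s)')
ax.set_ylabel('chord')
plt.show()
```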

Approach

My data → model → predictions pipeline can be summarized as follows:

First stage - Preprocessing

The Short-time Fourier Transform is used to convert the raw audio signal into a spectrogram - a time-frequency representation of the signal. Then, a filterbank with logarithmically spaced filters is applied to the spectrogram to rescale the frequency axis, equalizing the frequency distance between neighbouring notes in all areas of the spectrogram. Finally, a logarithmic function is applied to the spectrogram values to compress the value range, and the spectrogram is cut up into 0.1-second frames in a sliding-window fashion. These 0.1-second spectrogram frames are fed into the deep learning model for training and inference.
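
The sketch below illustrates these four steps with plain NumPy/SciPy (the project itself lists madmom, which provides equivalent spectrogram utilities). The frame size, filter count, frequency range, and context length are illustrative assumptions, not the project's exact settings:

```python
# Minimal sketch of the preprocessing stage; parameters are assumptions.
import numpy as np
from scipy.io import wavfile
from scipy.signal import stft

FPS = 10  # one spectrogram column every 0.1 s

sr, audio = wavfile.read('song.wav')  # assumes e.g. 44.1 kHz WAV input
if audio.ndim > 1:
    audio = audio.mean(axis=1)        # downmix to mono

# 1. Short-time Fourier Transform -> time-frequency representation
freqs, times, spec = stft(audio, fs=sr, nperseg=8192,
                          noverlap=8192 - sr // FPS)
mag = np.abs(spec)

# 2. Logarithmically spaced triangular filterbank to equalize note spacing
n_filters = 105
centers = np.geomspace(65.0, 2100.0, n_filters + 2)  # ~C2 to ~C7 (assumed)
bank = np.zeros((n_filters, len(freqs)))
for i in range(n_filters):
    lo, mid, hi = centers[i], centers[i + 1], centers[i + 2]
    rise = (freqs >= lo) & (freqs <= mid)
    fall = (freqs > mid) & (freqs <= hi)
    bank[i, rise] = (freqs[rise] - lo) / (mid - lo)
    bank[i, fall] = (hi - freqs[fall]) / (hi - mid)
filtered = bank @ mag

# 3. Logarithmic compression of the value range
log_spec = np.log1p(filtered)

# 4. Slide a window over the 0.1 s columns; each window is one model input
context = 15  # columns of context per example (illustrative)
windows = [log_spec[:, i:i + context]
           for i in range(log_spec.shape[1] - context + 1)]
```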

Second stage - Modeling

A Fully Convolutional Neural Network is trained on the log-filtered spectrogram frames for chord prediction. However, these predictions are not used directly, since the raw per-frame outputs tend to be fragmented, which might confuse the end user of the application. Moreover, the FCNN does not exploit the fact that chords are always parts of chord progressions, i.e. sequences, losing part of the potential predictive power. Therefore, a Conditional Random Field is introduced into the deep learning architecture to smooth out the chord sequence predictions and to capture frame-to-frame dependencies between the predictions at every time step. The features extracted by the CNN are the inputs to the CRF, and the final chord sequence predictions are obtained using Viterbi decoding.
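
As a rough illustration of the decoding step, here is a minimal NumPy sketch of Viterbi decoding over per-frame chord scores. In the application this is handled inside the CRF layer (tf2crf is listed under Tools); the scores and transition matrix below are toy values:

```python
# Sketch only: Viterbi decoding of the best chord sequence.
import numpy as np

def viterbi_decode(unary, transition):
    """unary: (T, K) per-frame chord scores (e.g. from the CNN).
    transition: (K, K) chord-to-chord transition scores.
    Returns the highest-scoring chord index sequence of length T."""
    T, K = unary.shape
    score = unary[0].copy()               # best score ending in each chord at t=0
    backptr = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        # cand[i, j]: best path ending in chord i at t-1, then moving to j
        cand = score[:, None] + transition
        backptr[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + unary[t]
    # Backtrack from the best final chord
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]

# Toy usage: 5 frames, 3 chord classes; "sticky" transitions favour smoothness
rng = np.random.default_rng(0)
unary = rng.normal(size=(5, 3))
transition = np.full((3, 3), -1.0) + 2.0 * np.eye(3)  # staying put is cheap
print(viterbi_decode(unary, transition))
```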

Product Design

Product design diagram

Tools

Packages & Tools used for development:

  • Docker
  • Streamlit
  • FFmpeg
  • spotdl
  • TensorFlow
  • Keras
  • tf2crf
  • NumPy
  • Seaborn
  • Matplotlib
  • madmom
  • YAML

Acknowledgements

[1] Filip Korzeniowski, whose research paper was implemented and integrated into this application.
[2] The Insight Artificial Intelligence Program, for the opportunity to work on this project and for their guidance and support.