Learning Audio Features For Genre Classification with Deep Belief Networks

Feature extraction is a crucial part of many music information retrieval(MIR) tasks. In this project, I present an end to end deep neural network that extract the features for a given audio sample, performs genre classification. The feature extraction is based on Discrete Fourier Transforms (DFTs) of the audio. The extracted features are used to train over a Deep Belief Network (DBN). A DBN is built out of multiple layers of Randomized Boltzmann Machines (RBMs), which makes the system a fully connected neural network. The same network is used for testing with a softmax layer at the end which serves as the classifier. This entire task of genre classification has been done with the Tzanetakis dataset and yielded a test accuracy of 74.6%.

Technologies used:

Python3
Keras
TensorFlow
Theano
NumPy
Pickle
LibROSA
Google Cloud Platform (GCP)
Developed a deep learning model/application for audio genre classification. Performed inference step with Keras with TensorFlow backend (GPU).
Entire application was run on a GCP virtual machine.
Model accuracy 74%, real time performance on CPU (<10ms) and GPU (<5ms).

Name		Name	Last commit message	Last commit date
Latest commit History 67 Commits
src		src
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

src

src

.gitignore

.gitignore

README.md

README.md

Repository files navigation

Learning Audio Features For Genre Classification with Deep Belief Networks

Technologies used:

About

Releases

Packages

Languages

M10han/Masters-project

Folders and files

Latest commit

History

Repository files navigation

Learning Audio Features For Genre Classification with Deep Belief Networks

Technologies used:

About

Resources

Stars

Watchers

Forks

Languages