M10han/Masters-project

Learning Audio Features For Genre Classification with Deep Belief Networks

Feature extraction is a crucial part of many music information retrieval (MIR) tasks. In this project, I present an end-to-end deep neural network that extracts features from a given audio sample and performs genre classification. Feature extraction is based on Discrete Fourier Transforms (DFTs) of the audio. The extracted features are used to train a Deep Belief Network (DBN). A DBN is built from multiple stacked layers of Restricted Boltzmann Machines (RBMs), which makes the system a fully connected neural network. The same network is used for testing, with a softmax layer at the end that serves as the classifier. The whole genre classification task was carried out on the Tzanetakis dataset and yielded a test accuracy of 74.6%.
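
As a rough illustration of DFT-based feature extraction, the sketch below splits a signal into overlapping frames and takes the log-magnitude spectrum of each one with NumPy. It is not the project's actual pipeline: the function name `dft_features`, the frame size of 1024, the hop size of 512, the Hann window, and the 22,050 Hz sample rate are all assumptions chosen for the example.

```python
import numpy as np


def dft_features(signal, frame_size=1024, hop_size=512):
    """Return one log-magnitude spectrum per frame (n_frames x n_bins).

    frame_size and hop_size are illustrative defaults, not project settings.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_size:
        # Zero-pad short clips so that at least one full frame exists.
        signal = np.pad(signal, (0, frame_size - signal.size))
    n_frames = 1 + (signal.size - frame_size) // hop_size
    window = np.hanning(frame_size)  # taper each frame to reduce leakage
    frames = np.stack([
        signal[i * hop_size: i * hop_size + frame_size] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real-valued signal.
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    # log1p compresses the dynamic range and is safe at zero magnitude.
    return np.log1p(magnitudes)


# Example: one second of a 440 Hz tone at an assumed 22,050 Hz sample rate.
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

features = dft_features(tone)
peak_bin = int(features.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 1024  # bin index -> frequency in Hz
print(features.shape, round(peak_hz, 1))
```

Each row of `features` could then be fed to the network as one input vector. For real recordings, LibROSA's `librosa.load` would typically supply the `signal` array.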

Technologies used:

  • Python 3

  • Keras

  • TensorFlow

  • Theano

  • NumPy

  • Pickle

  • LibROSA

  • Google Cloud Platform (GCP)

Highlights:

  • Developed a deep learning application for audio genre classification. Inference runs in Keras with the TensorFlow backend on a GPU.

  • The entire application ran on a GCP virtual machine.

  • Test accuracy of 74.6%, with real-time inference on CPU (<10 ms) and GPU (<5 ms).
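
One way to check a per-sample latency figure like the one above is to time repeated forward passes through the network. The sketch below does this in plain NumPy for a stack of fully connected sigmoid layers followed by a softmax output, which is the shape a trained DBN takes at inference time. The layer sizes, the random weights, and the helper name `predict_proba` are hypothetical, so the timing it prints says nothing about the project's Keras model.

```python
import time

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes: 513 spectrum bins in, two hidden layers,
# and 10 outputs (the Tzanetakis dataset has 10 genres).
layer_sizes = [513, 256, 128, 10]
weights = [
    rng.normal(0.0, 0.01, size=(n_in, n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def predict_proba(x):
    """Forward pass: sigmoid hidden layers, then a softmax classifier."""
    h = x
    for w, b in zip(weights[:-1], biases[:-1]):
        h = sigmoid(h @ w + b)
    return softmax(h @ weights[-1] + biases[-1])


x = rng.random((1, layer_sizes[0]))  # one stand-in feature vector
predict_proba(x)                     # warm-up call, excluded from timing

n_runs = 100
start = time.perf_counter()
for _ in range(n_runs):
    probs = predict_proba(x)
mean_ms = (time.perf_counter() - start) * 1000.0 / n_runs
print(f"mean latency: {mean_ms:.3f} ms per sample")
```

Averaging over many runs after a warm-up call keeps one-off setup costs out of the measurement. The same loop can wrap a Keras model's `predict` call to time the real network.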
