Using Keras to build a classifier for speech. Not necessarily a speaker identifier

dare0021/KerasBasedSpeechClassifier

Not an active project.

KerasBasedSpeechClassifier

Using Keras to build a classifier for speech. Not necessarily a speaker identifier

Setup

Python 2.7

NumPy is required.

WAV files are excluded from the repository because of possible legal issues.

How to use

Configure boot.py to use the branch and input that you want.

Or use rnn_mfcc.py and rnn_raw.py directly.

  1. MFCC based branch

    i) Use CMU Sphinx 4 to generate MFCC feature vector files from your audio.

    ii) Configure rnn_mfcc.py to contain the RNN that you want (or use as provided)

    iii) Configure mfcPreprocessor.py to reflect your data set. By default, it expects file names of the form _[M,F].mfc, where M and F are the two available classes.

    iv) Run via boot.py or rnn_mfcc.py directly.

    modelPlayer.py can be used to experiment with saved models.

  2. Raw wav data approach

This branch is currently not working. (Results similar to random guessing.)
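
Step iii of the MFCC branch depends on the class label being encoded in the file name. The following is a minimal sketch of how such a label could be read, assuming names like `speaker01_M.mfc`. The helper names, the regular expression, and the M=0 / F=1 mapping are illustrative assumptions and are not taken from mfcPreprocessor.py.

```python
import os
import re

# Hypothetical pattern: a file name ending in "_M.mfc" or "_F.mfc".
_LABEL_PATTERN = re.compile(r"_([MF])\.mfc$")

# Arbitrary class-to-index mapping chosen for this sketch.
CLASS_INDEX = {"M": 0, "F": 1}


def label_from_filename(path):
    """Return the class index encoded in the file name, or None if absent."""
    match = _LABEL_PATTERN.search(os.path.basename(path))
    if match is None:
        return None
    return CLASS_INDEX[match.group(1)]


def group_by_label(paths):
    """Map each class index to the list of paths that carry that label."""
    groups = dict((index, []) for index in CLASS_INDEX.values())
    for path in paths:
        label = label_from_filename(path)
        if label is not None:  # skip files that do not match the pattern
            groups[label].append(path)
    return groups
```

A data set with a different naming scheme would need a different pattern and mapping, which is presumably what configuring mfcPreprocessor.py amounts to.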


The repository contains two branches, neither finished:

  1. MFCC based branch

rnn_mfcc.py

Uses an MFCC feature set generated with CMU Sphinx 4.

  2. Raw wav data approach

rnn_raw.py

Uses data from uncompressed mono wav files, because that is the format of the available corpus.
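
For reference, reading an uncompressed mono wav file needs only the Python standard library. This is a sketch and not the code in rnn_raw.py: the function name `read_mono_wav`, the 16-bit restriction, and the scaling to the range [-1.0, 1.0) are assumptions made for illustration.

```python
import struct
import wave


def read_mono_wav(path):
    """Return (sample_rate, samples) for a 16-bit mono PCM wav file.

    Samples are floats scaled to [-1.0, 1.0). Hypothetical helper.
    """
    reader = wave.open(path, "rb")
    try:
        if reader.getnchannels() != 1:
            raise ValueError("expected a mono file")
        if reader.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")
        sample_rate = reader.getframerate()
        raw = reader.readframes(reader.getnframes())
    finally:
        reader.close()

    # PCM data in wav files is little-endian; "h" is a signed 16-bit integer.
    count = len(raw) // 2
    integers = struct.unpack("<%dh" % count, raw)
    samples = [value / 32768.0 for value in integers]
    return sample_rate, samples
```

The resulting list can be converted with `numpy.array(samples)` before it is passed to a model. Files with other sample widths or more than one channel are rejected here and would need extra handling.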


License: MIT
