GitHub - thomberg1/NeuralSpeechRecognition: Experiments with CNN, Deepspeech2 and RNN models.

Neural Speech Recognition

This repository contains experiments with CNN, Deepspeech2 and RNN models with different datasets.

The following results were obtained on the AN4 dataset:

(orange: ResNet, blue: ResNet+augmentation, dark red: Deepspeech, light blue: Deepspeech+augmentation, light red: EncoderDecoder, green: EncoderDecoder+augmentation, gray: EncoderDecoder+augmentation+pseudo labels)

ResNet CNN + CTC

Bleu: 70.180 WER: 16.482 CER: 9.792 ACC: 49.231

Deepspeech 2 + CTC (modified from Sean Naren's deepspeech.pytorch repository)

Bleu: 89.890 WER: 5.094 CER: 2.993 ACC: 76.923 (note that the net is overfitting)
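Both CTC-based models above emit one label per audio frame, so the frame sequence has to be collapsed into text before it can be scored. The following is a minimal sketch of greedy (best-path) CTC decoding, not necessarily the decoder used in the notebooks. The alphabet, the blank index, and the function name are assumptions made for illustration.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_ids, alphabet, blank=BLANK):
    """Turn per-frame argmax label ids into a string.

    Step 1: merge runs of identical consecutive labels.
    Step 2: drop the blank symbol.
    """
    collapsed = [label for label, _ in groupby(frame_ids)]
    return "".join(alphabet[i] for i in collapsed if i != blank)


# Hypothetical alphabet: index 0 is the blank, followed by a few characters.
ALPHABET = ["_", "a", "b", " "]

if __name__ == "__main__":
    frames = [1, 1, 0, 1, 2, 2, 0, 0, 3, 1]
    print(repr(ctc_greedy_decode(frames, ALPHABET)))
```

The blank between the two runs of `1` is what lets the decoder keep a doubled letter. Without it, the repeated frames would merge into a single character.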

Encoder Decoder RNN

Bleu: 83.770 WER: 7.529 CER: 5.735 ACC: 72.308
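The WER and CER figures above are edit-distance based error rates. The sketch below shows one common way to compute them, under these assumptions: errors are summed over the whole test set before dividing (the notebooks may instead average per utterance), and ACC is read as the percentage of utterances transcribed exactly. The reported ACC values are consistent with exact matches out of 130 utterances (for example 64/130 = 49.231), but that reading is an inference, not something the README states. The function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1]


def corpus_scores(refs, hyps):
    """Return WER, CER and exact-match accuracy, all in percent."""
    pairs = list(zip(refs, hyps))
    word_errors = sum(edit_distance(r.split(), h.split()) for r, h in pairs)
    char_errors = sum(edit_distance(r, h) for r, h in pairs)
    n_words = sum(len(r.split()) for r in refs)
    n_chars = sum(len(r) for r in refs)
    exact = sum(r == h for r, h in pairs)
    return {
        "WER": 100.0 * word_errors / n_words,
        "CER": 100.0 * char_errors / n_chars,
        "ACC": 100.0 * exact / len(refs),  # assumed meaning of ACC
    }


if __name__ == "__main__":
    # Made-up transcripts for illustration only.
    refs = ["go forward ten meters", "enter six"]
    hyps = ["go forward ten meters", "enter sex"]
    print(corpus_scores(refs, hyps))
```

Because WER counts whole words, a single wrong character costs a full word error, which is why WER is higher than CER for every model listed here.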

Folders and files

lib
resources
AN4_CNN_CTC.ipynb
AN4_DEEPSPEECH2_CTC.ipynb
AN4_SPEECH_ENCODER_DECODER.ipynb
AN4_audio_augmentation_normalization.ipynb
AN4_dataset.ipynb
README.md
