Attention-based Speech Recognizer

The reference implementation for the paper

End-to-End Attention-based Large Vocabulary Speech Recognition. Dzmitry Bahdanau, Jan Chorowski, Dmitriy Serdyuk, Philemon Brakel, Yoshua Bengio.

(arXiv draft, submitted to ICASSP 2016).

How to use

  • install all the dependencies (see the list below)
  • set your environment variables by running source env.sh

Then, please proceed to exp/wsj for instructions on how to replicate our results on the Wall Street Journal (WSJ) dataset (available from the Linguistic Data Consortium as LDC93S6B and LDC94S13B).
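
For example, a typical setup session might look like the following sketch (the clone URL is inferred from this repository's name; the exact experiment steps are described in exp/wsj):

```bash
# Hypothetical setup session; adjust paths to your checkout.
git clone https://github.com/ZhangAustin/attention-lvcsr.git
cd attention-lvcsr
source env.sh    # sets the environment variables needed by the bundled subtrees
cd exp/wsj       # instructions for the WSJ experiments live here
```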

Dependencies

  • Python packages: pykwalify, toposort, pyyaml, numpy, pandas, pyfst
  • kaldi
  • kaldi-python

If you already have the dataset in HDF5 format, the models can be trained without Kaldi and PyFst.
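
As a rough sketch, the pure-Python dependencies can be installed with pip as shown below (assuming the PyPI package names match the list above); kaldi and kaldi-python need to be built separately following their own instructions:

```bash
# Assumed PyPI names for the Python packages listed above.
pip install pykwalify toposort pyyaml numpy pandas
pip install pyfst   # binds to OpenFst, which must be installed on the system first
```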

Subtrees

The repository contains custom modified versions of Theano, Blocks, Fuel, picklable-itertools, and Blocks-extras as [subtrees](http://blogs.atlassian.com/2013/05/alternatives-to-git-submodule-git-subtree/). To ensure that these specific versions are used, we recommend uninstalling any regular installations of these packages, in addition to sourcing env.sh.
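
For instance, if these packages were previously installed with pip under their usual names (an assumption; adapt to however you installed them), they can be removed before sourcing env.sh:

```bash
# Remove any separately installed copies so the bundled subtree versions are used.
pip uninstall theano blocks fuel picklable-itertools blocks-extras
source env.sh
```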

License

MIT
