Jointly learning to align and convert graphemes to phonemes with neural attention models

Grapheme-to-Phoneme (G2P) conversion using attention-based encoder-decoder models

Dependencies

TensorFlow == 1.0.0
Bunch
Editdistance

Evaluation Datasets

We used the following datasets provided by Stanley Chen (stanchen@us.ibm.com):

CMUDict
Pronlex
NetTalk

Note: for CMUDict, consider using the newer version available at https://raw.githubusercontent.com/cmusphinx/cmudict/master/cmudict.dict
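If you use the newer CMUDict file, each entry needs to be split into a grapheme sequence and a phoneme sequence. The following is a minimal sketch of that step, not the parsing code in data_utils.py. It assumes one entry per line in the form `word PH1 PH2 ...`, alternate pronunciations marked as `word(2)`, and optional trailing comments starting with `#`. The helper name `parse_cmudict_line` is hypothetical.

```python
import re

# Matches a trailing variant marker such as "(2)" on alternate pronunciations.
# (Assumed format; check it against the file you download.)
_VARIANT = re.compile(r"\(\d+\)$")


def parse_cmudict_line(line):
    """Return (graphemes, phonemes) for one entry, or None for an empty line."""
    # Drop an assumed trailing "# comment" and surrounding whitespace.
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    word, *phones = line.split()
    # Strip the variant marker so "read(2)" and "read" share the same spelling.
    word = _VARIANT.sub("", word)
    return list(word), phones


sample = [
    "read R EH1 D",
    "read(2) R IY1 D",
    "",
]
pairs = [p for p in map(parse_cmudict_line, sample) if p is not None]
```

Each element of `pairs` is a (graphemes, phonemes) tuple; both pronunciations of "read" are kept as separate entries with the same grapheme sequence.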

Steps

Prepare data:

python data_utils.py -data_dir DATA_DIR [-{train,dev,test}_file] {TRAIN,DEV,TEST}_FILE

Train/evaluate models:

python g2p.py -data_dir DATA_DIR -tb_dir BASE_MODEL_DIR [-eval]
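G2P systems are commonly scored by phoneme error rate (PER) and word error rate (WER). The sketch below shows how these two metrics are typically computed from reference and predicted phoneme sequences. It is an illustration under that assumption, not the evaluation code in g2p.py. It uses a pure-Python Levenshtein distance to stay self-contained, and the function names `edit_distance` and `error_rates` are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of phoneme symbols."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(references, hypotheses):
    """Return (PER, WER) over parallel lists of phoneme sequences."""
    pairs = list(zip(references, hypotheses))
    total_edits = sum(edit_distance(r, h) for r, h in pairs)
    total_phones = sum(len(r) for r in references)
    wrong_words = sum(1 for r, h in pairs if r != h)
    return total_edits / total_phones, wrong_words / len(references)


refs = [["K", "AE", "T"], ["D", "AO", "G"]]
hyps = [["K", "AH", "T"], ["D", "AO", "G"]]
per, wer = error_rates(refs, hyps)
```

Here one of six reference phonemes is wrong and one of two words contains an error, so `per` is 1/6 and `wer` is 0.5.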

Reference

Jointly learning to align and convert graphemes to phonemes with neural attention models by Shubham Toshniwal and Karen Livescu.

A BibTeX entry is available for ease of citation.
