char_to_phone

This repo contains a simple seq2seq model for character-to-phoneme conversion.

Preprocessing:

The CMU Pronouncing Dictionary is used as the input dataset. All files can be found in the data folder.

The CMU dictionary is then transformed into TFRecords that serve as input to the seq2seq model. Each TFRecord Example contains a word, its phoneme conversion, and the length of the conversion.

To create the processed dataset, run the line below, which will create the train, dev, and test splits.

python create_tfrecords.py
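For reference, serializing one dictionary entry could look roughly like the sketch below. The feature names and output path are assumptions for illustration, not necessarily what create_tfrecords.py uses.

    import tensorflow as tf

    def make_example(word, phonemes):
        """Build a tf.train.Example for one CMU dictionary entry.

        Hypothetical sketch: feature names are assumptions, not taken
        from create_tfrecords.py.
        """
        feature = {
            # the raw word, e.g. "HELLO"
            "word": tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[word.encode("utf-8")])),
            # the phoneme conversion, e.g. "HH AH0 L OW1"
            "phonemes": tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[phonemes.encode("utf-8")])),
            # length of the conversion (number of phonemes)
            "phoneme_length": tf.train.Feature(
                int64_list=tf.train.Int64List(value=[len(phonemes.split())])),
        }
        return tf.train.Example(features=tf.train.Features(feature=feature))

    # Write a single example to a (hypothetical) train split file.
    with tf.io.TFRecordWriter("data/train.tfrecord") as writer:
        writer.write(make_example("HELLO", "HH AH0 L OW1").SerializeToString())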

Training:

The models used for training are contained in models.py. The training script runs one epoch over the train and dev datasets, then prints the loss and accuracy for both.

To train the model, run:

python train.py

Training automatically updates the checkpoint file at the end of each epoch. Additionally, if you want to resume training from a checkpoint, set line 61 of train.py to:

restore = True

Otherwise, a new model is trained from scratch.
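Conceptually, the restore flag controls TF1-style checkpoint logic along these lines. This is a minimal sketch; the stand-in variable, checkpoint directory, and Saver usage are assumptions, not copied from train.py.

    import os
    import tensorflow as tf

    # Stand-in for the seq2seq model variables defined in models.py (hypothetical).
    global_step = tf.Variable(0, name="global_step", trainable=False)

    restore = True                   # toggled on line 61 of train.py
    checkpoint_dir = "checkpoints"   # hypothetical checkpoint directory
    os.makedirs(checkpoint_dir, exist_ok=True)

    saver = tf.train.Saver()
    with tf.Session() as sess:
        latest = tf.train.latest_checkpoint(checkpoint_dir)
        if restore and latest is not None:
            # Resume from the most recent checkpoint.
            saver.restore(sess, latest)
        else:
            # Start training a new model from scratch.
            sess.run(tf.global_variables_initializer())

        # ... one epoch over train and dev, print loss and accuracy ...

        # Update the checkpoint file at the end of the epoch.
        saver.save(sess, os.path.join(checkpoint_dir, "model.ckpt"))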

Inference:

To test on your own data, run:

python test.py

For now, you have to manually update the word argument on line 25 of test.py to the word you want the model to run inference on. This will change to a command-line interface soon.
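The planned command-line interface could be as simple as an argparse wrapper like the sketch below. This is hypothetical; test.py does not currently accept arguments.

    import argparse

    # Hypothetical wrapper illustrating the planned command-line interface,
    # e.g. `python test.py hello`.
    parser = argparse.ArgumentParser(
        description="Character-to-phoneme inference")
    parser.add_argument("word", help="word to convert to phonemes")
    args = parser.parse_args()

    print("Running inference on:", args.word)
    # ... load the trained checkpoint and run the seq2seq model on args.word ...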

Acknowledgments:

I created this project because I wanted a clearer understanding of seq2seq models, and I was unable to find a tutorial that showed a simple version with separate train, dev, and test graphs in the way the TensorFlow NMT tutorial does. Additionally, I wanted an up-to-date tutorial that takes advantage of the tf.data libraries.

This small repo was inspired by the Pronouncing English Gradients Kaggle notebook, with elements from the TensorFlow NMT Tutorial and Park Chansung's Medium post.
