Joovvhan/transformer-tts

Transformer-TTS (and more!)

We plan to implement several deep-learning TTS algorithms here:

  • Transformer-TTS
  • FastSpeech
  • FastSpeech2
  • ...

TODO

  • Build Dataset Loader
  • Compare mel-spectrogram processing/loading time (3:1)
  • Build a model and modules
  • Baseline model architecture
  • Tensorboard logging
  • requirements.txt or Docker image
  • Overwrite configs with parsed command-line arguments
  • Check why phoneme dictionary is of length 12463
  • Make phoneme dictionary process multi-threaded
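
For the "overwrite configs with parsed arguments" item, one possible approach is sketched below using only the standard library. The config keys, their default values, and the `build_config` helper are hypothetical placeholders, not names taken from this repository.

```python
import argparse

# Hypothetical defaults; the repository's real config keys may differ.
DEFAULT_CONFIG = {"batch_size": 16, "learning_rate": 1e-3, "log_dir": "runs"}


def build_config(argv=None, defaults=DEFAULT_CONFIG):
    """Return a copy of `defaults` with any command-line overrides applied."""
    parser = argparse.ArgumentParser()
    for key, value in defaults.items():
        # default=None lets us tell "flag omitted" apart from "flag given".
        parser.add_argument(f"--{key}", type=type(value), default=None)
    args = parser.parse_args(argv)
    overrides = {k: v for k, v in vars(args).items() if v is not None}
    return {**defaults, **overrides}


config = build_config(["--batch_size", "32"])
print(config["batch_size"])  # 32
```

Flags that are not passed keep their default values. Boolean options would need separate handling (for example `action="store_true"`), because `type=bool` does not parse strings such as "False" the way one might expect.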

SETUP

  1. git clone https://github.com/Joovvhan/transformer-tts.git
  2. cd transformer-tts
  3. source scripts/set_locale.sh
  4. source scripts/init.sh
  5. python main.py

Reference

  • Neural Speech Synthesis with Transformer Network
  • Each phoneme has a trainable embedding of 512 dimensions.
  • The output of each convolution layer has 512 channels, followed by batch normalization, a ReLU activation, and a dropout layer.
  • A linear projection is added after the final ReLU activation.
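
The notes above describe an encoder pre-net; a minimal PyTorch sketch of that structure might look like the following. The class name `EncoderPreNet`, the number of convolution layers (3), the kernel size (5), and the dropout rate (0.5) are assumptions chosen for illustration and are not stated in this README.

```python
import torch
from torch import nn


class EncoderPreNet(nn.Module):
    """Sketch: phoneme embedding -> conv stack -> linear projection."""

    def __init__(self, num_phonemes, dim=512, num_conv_layers=3,
                 kernel_size=5, dropout=0.5):
        super().__init__()
        # Trainable 512-dimensional embedding per phoneme.
        self.embedding = nn.Embedding(num_phonemes, dim)
        layers = []
        for _ in range(num_conv_layers):
            layers += [
                # Odd kernel with symmetric padding keeps the sequence length.
                nn.Conv1d(dim, dim, kernel_size, padding=kernel_size // 2),
                nn.BatchNorm1d(dim),
                nn.ReLU(),
                nn.Dropout(dropout),
            ]
        self.convs = nn.Sequential(*layers)
        # Linear projection applied after the last conv block.
        self.projection = nn.Linear(dim, dim)

    def forward(self, phoneme_ids):
        x = self.embedding(phoneme_ids)          # (batch, time, dim)
        x = self.convs(x.transpose(1, 2))        # Conv1d expects (batch, dim, time)
        return self.projection(x.transpose(1, 2))  # (batch, time, dim)


model = EncoderPreNet(num_phonemes=100).eval()
out = model(torch.randint(0, 100, (2, 17)))
print(out.shape)  # torch.Size([2, 17, 512])
```

The linear projection lets the output take both positive and negative values, which a ReLU output alone cannot; the layer count, kernel size, and dropout rate should be checked against the paper before relying on this sketch.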
