- Transformer-TTS
- FastSpeech
- FastSpeech2
- ...
- Build Dataset Loader
- Compare mel-spectrogram processing/loading time (3:1)
- Build a model and modules
- Baseline model architecture
- Tensorboard logging
- requirements.txt or Docker image
- Overwrite configs with parsed command-line arguments
- Check why phoneme dictionary is of length 12463
- Make phoneme dictionary process multi-threaded
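For the config-overwriting TODO above, one possible approach is to give every CLI flag a `None` default and copy over only the values a user actually passed. This is a minimal sketch, not the repo's implementation; the key names (`batch_size`, `learning_rate`) and the helper `overwrite_config` are illustrative assumptions.

```python
import argparse

def overwrite_config(config: dict, args: argparse.Namespace) -> dict:
    """Return a copy of `config` where any CLI argument that was
    explicitly set (i.e. is not None) replaces the config value.
    Hypothetical helper; key names are illustrative."""
    updated = dict(config)
    for key, value in vars(args).items():
        if value is not None and key in updated:
            updated[key] = value
    return updated

parser = argparse.ArgumentParser()
# Defaults of None mean "not set on the command line"
parser.add_argument("--batch_size", type=int, default=None)
parser.add_argument("--learning_rate", type=float, default=None)
args = parser.parse_args(["--batch_size", "32"])  # simulate a CLI call

config = {"batch_size": 16, "learning_rate": 1e-4}
config = overwrite_config(config, args)
print(config)  # batch_size overwritten, learning_rate kept from config
```

Because unset flags stay `None`, config-file values survive unless the user overrides them explicitly.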
```bash
git clone https://github.com/Joovvhan/transformer-tts.git
cd transformer-tts
source scripts/set_locale.sh
source scripts/init.sh
python main.py
```
- Neural Speech Synthesis with Transformer Network
- Each phoneme has a trainable 512-dimensional embedding
- The output of each convolution layer has 512 channels and is followed by batch normalization, a ReLU activation, and a dropout layer
- A linear projection is added after the final ReLU activation
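The encoder pre-net described in the notes above can be sketched in PyTorch as follows. This is a sketch under assumptions, not the repo's code: the paper's pre-net uses a 3-layer CNN, and the kernel size (5), phoneme-vocabulary size, and dropout rate here are illustrative.

```python
import torch
import torch.nn as nn

class EncoderPrenet(nn.Module):
    """Sketch of the Transformer-TTS encoder pre-net: phoneme embedding,
    conv stack (conv -> batch norm -> ReLU -> dropout), final projection.
    Vocabulary size, kernel size, and dropout are assumed values."""

    def __init__(self, num_phonemes=80, emb_dim=512, num_layers=3, dropout=0.1):
        super().__init__()
        # Each phoneme gets a trainable 512-dim embedding
        self.embedding = nn.Embedding(num_phonemes, emb_dim)
        layers = []
        for _ in range(num_layers):
            layers += [
                # Each conv layer outputs 512 channels ...
                nn.Conv1d(emb_dim, emb_dim, kernel_size=5, padding=2),
                # ... followed by batch norm, ReLU, and dropout
                nn.BatchNorm1d(emb_dim),
                nn.ReLU(),
                nn.Dropout(dropout),
            ]
        self.convs = nn.Sequential(*layers)
        # Linear projection after the final ReLU activation
        self.projection = nn.Linear(emb_dim, emb_dim)

    def forward(self, phoneme_ids):
        # phoneme_ids: (batch, time) int64
        x = self.embedding(phoneme_ids)            # (B, T, 512)
        x = self.convs(x.transpose(1, 2))          # conv over time: (B, 512, T)
        return self.projection(x.transpose(1, 2))  # (B, T, 512)

prenet = EncoderPrenet()
out = prenet(torch.zeros(2, 7, dtype=torch.long))
print(out.shape)  # each of the 7 phoneme positions maps to a 512-dim vector
```

The transposes are needed because `nn.Conv1d` and `nn.BatchNorm1d` expect the channel dimension second, while the embedding and projection operate on the last dimension.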