Skip to content

Harishgeth/cookietts

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

20 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

This repo does not work!

Check back in a week. Thanks.

Missing stuff;

  • Pitch preprocessing
  • Energy preprocessing
  • Normalize transcripts before running Montreal Forced Aligner
  • Tacotron2, Flow-TTS and WaveGlow support for .json filelists.
  • Flow-TTS (pretrained model and guided Attention)
  • WaveFlow (the entire thing)

Install/Setup:

Clone this repo: https://github.com/CookiePPP/cookietts.git

This will make a folder called cookietts where the command is run, and clone the repo into said folder.

Run cd cookietts

This will move you into the cloned repo.

Run pip install -e .

This will 'install' the package (without moving around any files).

Run pip install -r requirements.txt

This will install dependencies.

e.g: pytorch, numpy, scipy, tensorboard, librosa

Install Nvidia/Apex.

Nvidia/Apex contains the LAMB optimizer and faster running fused optimizers.

This also allows for fp16 (mixed-precision) training, which saves VRAM and runs faster on RTX cards.

(please nag me if you cannot install Apex. Pytorch added native fp16 (mixed-precision) support some time ago and I might be able to set that up as an alternative)

That should be the main stuff.

If something fails during preprocessing, check you have ffmpeg and sox installed as well.

If something else fails, you can create an 'issue' on the top of the github page.


Usage:

Head into the CookieTTS folder and read around.

If you want to train custom multispeaker models then folders 0, 1 and 2 are of interest.

If you want to use an already trained model to generate new speech then folder 5 (_5_infer) is of interest.

About

TTS from Cookie. Messy and experimental!

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Jupyter Notebook 98.7%
  • Python 1.3%