Exploring Machine Speech Chain for Domain Adaptation

This is an implementation of the paper, based on ESPnet. If you have any questions, please email me (11930381@mail.sustech.edu.cn).

Requirements

Follow the ESPnet installation instructions.
You should use torch==1.7.1.
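For example, a minimal setup might look like the following (the clone URL and the use of ESPnet's standard tools/ installer are assumptions based on a typical ESPnet layout; adapt them to your environment):

git clone https://github.com/fengpeng-yue/ASRTTS.git
cd ASRTTS/tools
make                        # build Kaldi and the other tool dependencies, as in the ESPnet installation guide
pip install torch==1.7.1    # pin the PyTorch version required by this repository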

Pretraining

You should download LibriSpeech and LibriTTS manually.
LibriSpeech: run ./pretrain_asr.sh under egs/librispeech/asr (the recipe trains the ASR model on LibriSpeech train-clean-460).
LibriTTS: run ./pretrain_tts.sh under egs/libritts/tts (the recipe trains the TTS model on LibriTTS train-clean-460).
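Concretely, the two pretraining runs can be launched like this (starting from the repository root):

# ASR pretraining on LibriSpeech train-clean-460
cd egs/librispeech/asr
./pretrain_asr.sh

# TTS pretraining on LibriTTS train-clean-460
cd ../../libritts/tts
./pretrain_tts.sh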

Adaptation training

You should download TED-LIUM-1 manually. We provide the punctuated TED-LIUM text under the egs/tedlium/data path.
Execution directory (egs/tedlium/asrtts):
Run ./prepare_data.sh to prepare the JSON files for training, then run ./joint_training.sh for joint training.
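A typical run from the repository root would be:

cd egs/tedlium/asrtts
./prepare_data.sh       # prepare the JSON files used for joint training
./joint_training.sh     # run the joint ASR/TTS adaptation training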

Experimental options in joint_training.sh for the three-stage training (a sketch of a full stage configuration follows the settings below):

Stage 1:

update_asr=true
update_tts=false
update_tts2asr=true
filter_data=true
filter_thre=0.58
unpaired_aug=true

Stage 2:

asrexpdir= # change this from the ASR baseline path to the adapted ASR model from Stage 1
update_asr=false
update_tts=true
update_tts2asr=true
filter_data=false
unpaired_aug=false
tts_loss_weight=0.005

Stage 3:

ttsexpdir= # change this from the TTS baseline path to the adapted TTS model from Stage 2
update_asr=false
update_tts=true
update_tts2asr=true
filter_data=true
filter_thre=0.58
unpaired_aug=true
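As a concrete illustration, the Stage 2 settings above correspond to a variable block like the following inside joint_training.sh (a sketch only; the exact layout of the script is not shown here, the experiment directory name is a hypothetical placeholder, and asrexpdir must point to your own Stage 1 output):

# Stage 2 settings: ASR frozen, TTS updated, data filtering and unpaired augmentation disabled
asrexpdir=exp/<your_stage1_asr_adaptation_dir>   # hypothetical path; use the ASR model adapted in Stage 1
update_asr=false
update_tts=true
update_tts2asr=true
filter_data=false
unpaired_aug=false
tts_loss_weight=0.005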
