
world_merlin

To build a new voice with the Merlin toolkit, using CLUSTERGEN's question set:

###Simple Steps:

####0. Copy this folder into $FESTVOXDIR/src/world_merlin

####1. Set up the environment variables:

export ESTDIR=/path/to/speech_tools
export FESTVOXDIR=/path/to/festvox
export SPTKDIR=/path/to/SPTK
THEANO_FLAGS="floatX=float32"
export THEANO_FLAGS
PYTHONPATH=:/usr/lib/python2.7/dist-packages
export PYTHONPATH
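
Before moving on, it can help to confirm that the three toolkit variables are set and point at real directories. The following is a minimal sketch, not part of this repository: the function name `check_environment` is hypothetical, and the only facts it relies on are the variable names exported above.

```python
import os

# Variables exported in step 1; the list mirrors this README and nothing more.
REQUIRED_VARS = ("ESTDIR", "FESTVOXDIR", "SPTKDIR")


def check_environment(env=None):
    """Return a list of human-readable problems; an empty list means all is well.

    `env` is any mapping of variable names to values and defaults to the
    environment of the current process.
    """
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name, "")
        if not value:
            problems.append(name + " is not set")
        elif not os.path.isdir(value):
            problems.append(name + " does not point to a directory: " + value)
    return problems
```

Calling `check_environment()` with no arguments inspects the current shell environment, so it should be run from the same shell in which the exports were made.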

####2. Make a new voice directory and set up the initial directory structure

mkdir <institute>_<lexicon>_<voicename>
example: mkdir cmu_us_pnb
cd cmu_us_pnb
$FESTVOXDIR/src/world_merlin/setup_world_merlin cmu us pnb

For Indic languages do:

$FESTVOXDIR/src/world_merlin/setup_world_merlin_indic cmu indic <lang> pnb

where <lang> is one of: asm, ben, guj, hin, kan, mar, pan, raj, tam, tel.
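
A mistyped language code is easy to miss, so the argument list for the Indic setup script can be assembled and checked first. This is only an illustrative sketch: `indic_setup_args` is a hypothetical helper, and the argument order is copied from the command shown above rather than from the script's own documentation.

```python
import posixpath

# Language codes listed in this README for the Indic setup script.
INDIC_LANGS = ("asm", "ben", "guj", "hin", "kan", "mar", "pan", "raj", "tam", "tel")


def indic_setup_args(festvoxdir, institute, lang, voicename):
    """Build the argument list for setup_world_merlin_indic.

    Raises ValueError when `lang` is not one of the codes listed above.
    """
    if lang not in INDIC_LANGS:
        raise ValueError("unsupported language code: " + lang)
    script = posixpath.join(festvoxdir, "src", "world_merlin", "setup_world_merlin_indic")
    # Same order as the README command: <institute> indic <lang> <voicename>
    return [script, institute, "indic", lang, voicename]
```

The returned list can be passed to `subprocess.call` from inside the new voice directory.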

####3. Copy the transcript, in Festival format:

cp <TRANSCRIPT_DIR>/txt.done.data etc/txt.done.data

The file must be named txt.done.data, and each entry must have the format:

( wavfile_name "Transcription of the wave file." )

For example:

( arctic_a0001 "Author of the danger trail, Philip Steels, etc." )
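
A malformed transcript line tends to surface only much later in the build, so a quick format check can save time. The sketch below assumes one entry per line and uses a regular expression that is stricter than Festival's own reader. The names `parse_entry` and `find_bad_lines` are hypothetical and not part of this repository.

```python
import re

# One entry per line: ( utterance_id "transcription text" )
ENTRY_RE = re.compile(r'^\(\s*(\S+)\s+"(.*)"\s*\)$')


def parse_entry(line):
    """Return (utterance_id, text) for a well-formed line, otherwise None."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


def find_bad_lines(lines):
    """Return the 1-based numbers of non-blank lines that fail to parse."""
    return [
        number
        for number, line in enumerate(lines, 1)
        if line.strip() and parse_entry(line) is None
    ]
```

For instance, `find_bad_lines(open("etc/txt.done.data"))` lists the lines that need attention before running the later steps.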

####4. Copy wav files from your directory and power normalize:

./bin/get_wavs <WAVDIR>/*.wav

####5. Optionally, remove extra silences.

Remove leading and trailing silences:

./bin/prune_silence wav/*.wav

Remove middle silences:

./bin/prune_middle_silences wav/*.wav

####6. Run the voice building script.

./bin/build_merlin_world_voice

#####Note: the last step of the above script assumes the default location and name of the trained neural network model.

The individual steps performed by build_merlin_world_voice are listed below. They continue after step 5 above.

####6. Dump the aligned WORLD features together with CLUSTERGEN's features:

./bin/dump_world_feats

####7. Make train/test/val splits.

./bin/make_file_id_list.sh `pwd`  
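
To make the idea of a train/validation/test split concrete, here is a minimal sketch of an ordered split over a list of utterance IDs. It is an assumption for illustration only and does not reproduce what `make_file_id_list.sh` actually does. The function name `split_file_ids` and the count parameters are hypothetical.

```python
def split_file_ids(file_ids, n_valid, n_test):
    """Split an ordered list of utterance IDs into (train, valid, test).

    The first IDs form the training set, the next `n_valid` the validation
    set, and the final `n_test` the test set. The input order is preserved.
    """
    if n_valid < 0 or n_test < 0:
        raise ValueError("set sizes must be non-negative")
    held_out = n_valid + n_test
    if held_out >= len(file_ids):
        raise ValueError("not enough utterances to leave a training set")
    n_train = len(file_ids) - held_out
    train = file_ids[:n_train]
    valid = file_ids[n_train:n_train + n_valid]
    test = file_ids[n_train + n_valid:]
    return train, valid, test
```

Keeping the split deterministic means that repeated builds of the same voice train and evaluate on the same utterances.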

####8. Set up the configuration file:

./bin/setup_conf.sh ss_dnn/feed_forward_dnn_WORLD_template.conf

####9. Train DNN

python ss_dnn/merlin_scripts/src/run_dnn.py ss_dnn/feed_forward_dnn_WORLD.conf

####10. Resynthesize the wave files:

MODEL_NAME=`cat etc/gen_model_file_name`
./bin/merlin_resynthesis.sh ss_dnn/gen/$MODEL_NAME

About

build voice using merlin and WORLD vocoder using clustergen linguistic features
