Stellenbosch University ZeroSpeech 2019 System

Please note: This code is currently in a very preliminary state, i.e. it would be hard to use out of the box. We hope to clean it up and make it more usable in the near future.

Overview

The ZeroSpeech challenges aim to answer the question of how we can build speech processing systems directly from speech audio without any labels. It has the dual motivation of understanding language acquisition in humans and developing technology for extremely low-resource languages. The task in ZeroSpeech 2019 is "TTS without T", i.e. text-to-speech without textual input. This is the repository for suzerospeech, the Stellenbosch University ZeroSpeech 2019 system.

Disclaimer

The code provided here is not pretty. But we believe that research should be reproducible. We provide no guarantees with the code, but please let us know if you have any problems, find bugs or have general comments.

Repository structure

  • docker/ - Dockerfiles for building the project environment (see the Docker section below).
  • data/ - Any data files that we produce or get from the challenge organisers.
  • features/ - Input features (MFCCs, filterbanks, etc.) are extracted here.
  • wavenet/ - WaveNet speech synthesis.
  • notebooks/
    • vq_vae.ipynb
    • cat_vae.ipynb
  • evaluation/
  • src/ - Mature source code shared across different parts of the project.
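The notebooks above contain the discretisation models. As a rough illustration, the vector-quantisation bottleneck at the heart of a VQ-VAE can be sketched in a few lines of NumPy; the codebook size and dimensionality below are arbitrary assumptions, not the values used in vq_vae.ipynb:

```python
import numpy as np

def vq_bottleneck(z, codebook):
    """Map each continuous encoding to its nearest codebook vector.

    z: (num_frames, dim) continuous encoder outputs.
    codebook: (num_codes, dim) embedding vectors.
    Returns (indices, quantised), where quantised[i] = codebook[indices[i]].
    """
    # Squared Euclidean distance from every frame to every code.
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)     # discrete symbols ("TTS without T" units)
    return indices, codebook[indices]  # quantised representation

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))     # 8 codes, 4-dimensional (assumed sizes)
z = rng.normal(size=(100, 4))          # 100 frames of encoder output
idx, zq = vq_bottleneck(z, codebook)
```

The discrete indices play the role of the missing text: the synthesiser (e.g. the WaveNet in wavenet/) is conditioned on them rather than on the continuous encodings.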

Docker

This recipe comes with Dockerfiles which can be used to build images containing all of the required dependencies. The recipe can be completed without Docker, but using the image makes it easier to resolve dependencies. Note that our Dockerfile differs from the one provided as part of the challenge.

To build the docker image, run the following:

cd docker
docker build -f Dockerfile.tf-py36.cpu -t tf-py36 .
cd ..

There is also a GPU version of the image. The remaining steps in this recipe can be run in an interactive container. Start one with the required data directories mounted:

docker run \
    -v ~/endgame/datasets/zerospeech2019/shared/databases/english/:/data/english \
    -v "$(pwd)":/home -it -p 8887:8887 tf-py36

To run on a GPU, --runtime=nvidia is additionally required.
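Concretely, the interactive command above becomes the following on a GPU host (a sketch only; the GPU image may be built from a different Dockerfile and carry a different tag, so check docker/ for the actual names):

```shell
docker run --runtime=nvidia \
    -v ~/endgame/datasets/zerospeech2019/shared/databases/english/:/data/english \
    -v "$(pwd)":/home -it -p 8887:8887 tf-py36
```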

To directly start a Jupyter notebook in a container, run:

docker run --rm -it -p 8889:8889 \
    -v ~/endgame/datasets/zerospeech2019/shared/databases/english/:/data/english \
    -v "$(pwd)":/home \
    tf-py36 \
    bash -c "ipython notebook --no-browser --ip=0.0.0.0 --allow-root --port=8889"

and then open http://localhost:8889/ in a browser.

Preliminaries

If you are not using the docker image, first install all the standalone dependencies (see the Dependencies section below), then follow the steps in this section. The docker image already includes these dependencies and GitHub repositories.

Clone the required GitHub repositories into ../src/ as follows:

mkdir ../src/  # not necessary when using docker
cd ../src/
git clone https://github.com/jameslyons/python_speech_features
cd python_speech_features
python setup.py develop
cd ../../suzerospeech2019/

Feature extraction

Move to features/ and execute the steps in features/readme.md.
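The details live in features/readme.md, but filterbank and MFCC extraction (as in the python_speech_features package installed above) follows a standard recipe: slice the waveform into overlapping frames, take each frame's magnitude spectrum, and pool it through mel-spaced triangular filters. A minimal self-contained NumPy sketch, using common default frame lengths and filter counts that are assumptions rather than the values this repository actually uses:

```python
import numpy as np

def log_mel_filterbank(signal, sr=16000, n_fft=512, n_mels=40,
                       frame_len=400, hop=160):
    """Log-mel filterbank features with common defaults (25 ms frames,
    10 ms hop at 16 kHz) -- assumed, not necessarily what features/ uses."""
    # 1. Slice the waveform into overlapping, windowed frames.
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len]
                       for i in range(n_frames)])
    frames = frames * np.hamming(frame_len)

    # 2. Magnitude spectrum of each frame.
    spec = np.abs(np.fft.rfft(frames, n_fft))  # (n_frames, n_fft // 2 + 1)

    # 3. Mel-spaced triangular filters pooling the spectrum.
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, centre, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, centre):
            fbank[m - 1, k] = (k - left) / max(centre - left, 1)
        for k in range(centre, right):
            fbank[m - 1, k] = (right - k) / max(right - centre, 1)

    return np.log(spec @ fbank.T + 1e-8)  # (n_frames, n_mels)

sr = 16000
t = np.arange(sr) / sr
fb = log_mel_filterbank(np.sin(2 * np.pi * 440 * t), sr=sr)  # 1 s of 440 Hz
```

MFCCs are then obtained by taking a DCT of these log filterbank energies and keeping the first dozen or so coefficients.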

Consistency rules for this repository

  • British spelling for naming and documentation.
  • Use double quotes "..." for Python strings.

Dependencies

License

This code is distributed under the Creative Commons Attribution-ShareAlike license (CC BY-SA 4.0).
