

Malaya-Speech is a speech toolkit library for Bahasa Malaysia, powered by Tensorflow deep learning.

Documentation

Proper documentation is available at https://malaya-speech.readthedocs.io/

Installing from PyPI

CPU version:

$ pip install malaya-speech

GPU version:

$ pip install malaya-speech-gpu

Only Python 3.6.0 and above and Tensorflow 1.15.0 and above are supported.

We recommend using virtualenv for development. All examples are tested on Tensorflow versions 1.15.4 and 2.4.1.
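A minimal setup sketch following the virtualenv recommendation above (the environment name is arbitrary):

```shell
# create an isolated environment for Malaya-Speech development
python3 -m venv malaya-env

# activate it (POSIX shells)
. malaya-env/bin/activate

# then install the CPU build inside the environment:
#   pip install malaya-speech
```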

Features

  • Age Detection, detect age in speech using Finetuned Speaker Vector Malaya-Speech models.
  • Speaker Diarization, diarizing speakers using Pretrained Speaker Vector Malaya-Speech models.
  • Emotion Detection, detect emotions in speech using Finetuned Speaker Vector Malaya-Speech models.
  • Gender Detection, detect genders in speech using Finetuned Speaker Vector Malaya-Speech models.
  • Language Detection, detect hyperlocal languages in speech using Finetuned Speaker Vector Malaya-Speech models.
  • Noise Reduction, reduce multilevel noises using Pretrained STFT UNET Malaya-Speech models.
  • Speaker Change, detect speaker changes using Finetuned Speaker Vector Malaya-Speech models.
  • Speaker Overlap, detect overlapping speakers using Finetuned Speaker Vector Malaya-Speech models.
  • Speaker Vector, calculate similarity between speakers using Pretrained Malaya-Speech models.
  • Speech Enhancement, enhance voice activities using Pretrained Waveform UNET Malaya-Speech models.
  • Speech-to-Text, End-to-End Speech to Text using RNN-Transducer Malaya-Speech models.
  • Super Resolution, 4x audio super resolution using Pretrained Super Resolution Malaya-Speech models.
  • Text-to-Speech, using Pretrained Tacotron2 and FastSpeech2 Malaya-Speech models.
  • Vocoder, convert Mel to Waveform using Pretrained MelGAN, Multiband MelGAN and Universal MelGAN Vocoder Malaya-Speech models.
  • Voice Activity Detection, detect voice activities using Finetuned Speaker Vector Malaya-Speech models.
  • Voice Conversion, Many-to-One, One-to-Many, Many-to-Many, and Zero-shot Voice Conversion.
  • Hybrid 8-bit Quantization, hybrid 8-bit quantization for all models, reducing inference time by up to 2x and model size by up to 4x.
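The 8-bit quantization listed above maps 32-bit float weights to 8-bit integers plus a scale factor, which is where the roughly 4x size reduction comes from (4 bytes per weight down to 1). A minimal, library-free sketch of the idea, using symmetric per-tensor quantization (an illustration of the general technique, not the exact scheme Malaya-Speech uses):

```python
def quantize_int8(weights):
    """Symmetrically quantize float weights to int8 values plus a scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    # map each weight to the nearest int8 step, clamped to the int8 range
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.2, 0.03, 0.99]
q, scale = quantize_int8(weights)
approx = dequantize_int8(q, scale)

# each recovered weight is within half a quantization step of the original
assert all(abs(a - w) <= scale / 2 + 1e-9 for a, w in zip(approx, weights))
```

In practice, frameworks keep sensitive operations (e.g. accumulations) in float while storing weights in int8, hence "hybrid" quantization.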

Pretrained Models

Malaya-Speech also releases pretrained models; see malaya-speech/pretrained-model.

References

If you use our software for research, please cite:

@misc{Malaya-Speech,
  author = {Husein, Zolkepli},
  title = {Malaya-Speech, Speech-Toolkit library for bahasa Malaysia, powered by Deep Learning Tensorflow},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/huseinzol05/malaya-speech}}
}

Acknowledgement

Thanks to KeyReply for sponsoring the private cloud used to train Malaya-Speech models; without it, this library would not have been possible.
