This utility helps to recognize speech from audio files through multiple services. Right now STTMultiservice can work with:
- Google Cloud Speech-To-Text
- Yandex SpeechKit
- Wit.ai
STTMultiservice main purpose is to recognize dialogs from phone calls.
Please note, this app requires Python 3 (tested with Python 3.6 and 3.7) and won't run on versions, that are earlier.
You will need libmagic for STT multiservice. Read the installation article here.
STTMultiservice also requires ffmpeg. Consider this article to install it.
Clone this repository using command:
$ git clone {REPO_URL_HERE}
Specify environment variables (for example, in .bash_profile):
GOOGLE_APPLICATION_CREDENTIALS - path to JSON file from Google Cloud
GOOGLE_APPLICATION_PROJECT_NAME - Google Cloud project name
YANDEX_ASR_SERVICE_ACCOUNT_ID - service account id from Yandex Cloud
YANDEX_ASR_KEY_ID - key id, obtained for specified service account
YANDEX_ASR_FOLDER_ID - folder id, for which have permission to read and write specified service account
YANDEX_ASR_PRIVATE_CERT - path to private certificate for specified service account (PEM file)
WIT_ASR_ACCESS_TOKEN - access token for Wit.ai application
To get JSON file with credentials read this article. You only need to follow first step of "Before you begin".
To get all required credentials from Yandex consider this article. Follow steps for service account, right now STTMultiservice works only with it. You just need to get required credentials and private certificate. STTMultiservice will obtain IAM token by itself.
Follow this article to create application. Please note, you can specify language only in your application for Wit recognition. Next, get "Server Access Token " on your application's settings page.
Please note - you only need to configure services, which you will use.
Make sure, that your default interpreter is Python 3. In another cases add "python3" (or something like this) in the beginning of the command. To run recognition use this command:
$ ./recognizer.py -f=PATH_TO_FILE_FOR_RECOGNITION
To get more help run
$ ./recognizer.py --help
Right now you can only get answer in JSON format.