Speech to Text Testing Tool

This app transcribes and tests audio files using the speech-to-text services from Microsoft, Amazon and Google, to help compare their accuracy and performance.

Getting started

Select your service by navigating to the appropriate folder:

cd Amazon-Transcribe

cd Google-Speech-to-Text-API

cd Microsoft-Speech-SDK

Each service has its own README, found in its directory, containing setup and usage instructions.

File structure

Make sure the paths to the files listed below are correct in settings.py before running.

Each service's directory has a results folder that contains the following files:
- ref.txt: stores the original transcript of the audio file you want to transcribe. Enter the original transcript before running the app
- hyp.txt: stores the transcription once the app has run and the transcription has been generated
- results.csv: stores the results once generated (transcripts, WER and word error count)
- table.txt: stores the WER and word error count results
- alltranscriptions.txt: stores all text that has been transcribed
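
Since ref.txt has to be filled in before a run, a quick check can catch a missing or empty reference early. This is only a minimal sketch: the helper name `reference_ready` is hypothetical and not part of any of the apps, and the folder you pass in should match whatever path you have configured in settings.py.

```python
from pathlib import Path


def reference_ready(results_dir):
    """Return True if ref.txt exists in results_dir and contains some text.

    Hypothetical helper for illustration; it is not part of the apps.
    """
    ref_path = Path(results_dir) / "ref.txt"
    if not ref_path.is_file():
        return False
    # An empty (or whitespace-only) reference cannot be scored against.
    return ref_path.read_text(encoding="utf-8").strip() != ""


if __name__ == "__main__":
    # "results" is assumed to be the results folder of the chosen service.
    if not reference_ready("results"):
        print("ref.txt is missing or empty: add the original transcript first")
```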

Testing structure

This app measures the accuracy of transcriptions using word error rate (WER).

Word error rate (WER) is a method of measuring the performance of automated speech recognition (ASR). It compares the original transcript (reference) with the transcribed text (hypothesis) from a speech-to-text service.

WER has its pros and cons, but overall it provides a baseline accuracy metric for general use, expressed as a percentage.
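
As an illustration, WER is commonly defined as (substitutions + deletions + insertions) divided by the number of words in the reference, which is the word-level edit distance between the two texts over the reference length. The sketch below follows that standard definition; it is not taken from the apps themselves, which may compute WER differently (for example through a library), and the function names are hypothetical.

```python
def word_errors(reference, hypothesis):
    """Count word-level substitutions, deletions and insertions
    (edit distance) needed to turn the reference into the hypothesis."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Return WER as a fraction: word errors / number of reference words."""
    ref_length = len(reference.split())
    if ref_length == 0:
        raise ValueError("reference transcript is empty")
    return word_errors(reference, hypothesis) / ref_length


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sit on mat"
    # One substitution (sat -> sit) and one deletion (the): 2 errors over 6 words.
    print(word_errors(ref, hyp))               # 2
    print(f"{word_error_rate(ref, hyp):.1%}")  # 33.3%
```

Because insertions count as errors, WER can exceed 100% when the hypothesis contains many more words than the reference.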

Usage

Each app supports single and batch processing. With batch processing, the average of the results is calculated automatically.
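
There is more than one way to average WER over a batch, and this README does not state which one the apps use. The sketch below shows two common options, assuming each file's result is available as a pair of (word error count, number of reference words); the function names and the input shape are assumptions made for illustration.

```python
def mean_of_file_wers(results):
    """Average the per-file WER values, so every file weighs the same."""
    rates = [errors / ref_words for errors, ref_words in results]
    return sum(rates) / len(rates)


def pooled_wer(results):
    """Total errors over total reference words, so longer files weigh more."""
    total_errors = sum(errors for errors, _ in results)
    total_words = sum(ref_words for _, ref_words in results)
    return total_errors / total_words


if __name__ == "__main__":
    # (word error count, reference word count) for two hypothetical files.
    batch = [(1, 10), (10, 20)]
    print(f"mean of per-file WER: {mean_of_file_wers(batch):.1%}")
    print(f"pooled WER: {pooled_wer(batch):.1%}")
```

The two figures differ whenever the files have different lengths, so it is worth knowing which one you are reading when comparing services.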

Select your service by navigating to the appropriate folder. You can find a README with service-specific information there.
Make sure you have understood and completed the prerequisites.
Gather audio samples. I recommend creating a sounds folder and placing the audio files there.
Install all required dependencies by executing:
```
npm run setup
```
Run the app by executing:
```
npm start
```
Analyse the results (table.txt and results.csv).

Further information can be found in each service's README.

General Info

To use each tool, you need an account with your service of choice. Each of the services is paid, but all offer a free trial period.
Each service requires audio files to be in a specific format. Details can be found in each project's README.
Both the original transcript (ref.txt) and the transcribed text (hyp.txt) are 'cleaned' into a consistent stylistic format before WER is calculated. For example, digits like 1, 64 and 3000 are converted to their corresponding words: one, sixty-four and three thousand, respectively. Punctuation and unnecessary whitespace are also removed.
You may have to resolve stylistic differences such as "street" and "st" yourself so that ref.txt is consistent with the transcription service's output.
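
To make the cleaning step concrete, here is a rough sketch of that kind of normalisation. It is not the apps' actual cleaning code: the function names are hypothetical, only whole numbers up to 999,999 are spelled out (larger numbers are left as digits), and decimals or ordinals are not handled.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer from 0 to 999,999 (limit chosen for this sketch)."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + (" " + number_to_words(rest) if rest else "")
    thousands, rest = divmod(n, 1000)
    return number_to_words(thousands) + " thousand" + (" " + number_to_words(rest) if rest else "")


def _spell(match):
    """Replace a run of digits with words, leaving very large numbers as-is."""
    value = int(match.group())
    return number_to_words(value) if value <= 999_999 else match.group()


def clean(text):
    """Remove punctuation, spell out digits and collapse extra whitespace."""
    text = re.sub(r"[^\w\s]", "", text)   # drop punctuation first, so "3,000" becomes "3000"
    text = re.sub(r"\d+", _spell, text)   # then convert digits to words
    return " ".join(text.split())         # collapse runs of whitespace


if __name__ == "__main__":
    print(clean("I have 1, 64 and 3000  apples!"))
    # I have one sixty-four and three thousand apples
```

Running the same function over both ref.txt and hyp.txt keeps the two texts comparable, which is the point of the cleaning step.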

Find out more about this project and our findings in our blog

Acknowledgments

Applied Innovation - Kainos

Disclaimer

This project was developed using:

Python 3.7.4, with Python module versions as described in requirements.txt
Node.js v10.16.0, with npm packages as described in package.json

Software versions are subject to change with new releases. To ensure the project runs smoothly without alteration, use the versions above. This software was last run in 09/2019.
