Large-Scale End-to-End Multilingual Speech Recognition and Language Identification with Multi-Task Learning Demo

Overview

This repository contains the demo code for the paper Large-Scale End-to-End Multilingual Speech Recognition and Language Identification with Multi-Task Learning. It consists of the trained models, the Python inference code, a simple frontend webpage, and a backend Node.js server.

Use the demo with the frontend webpage

Clone the repository from https://github.com/Porridge144/sup-mlt-demo.git
cd into model_export and run python pyonnxrt.py (the script is blocking, so you may need to run it in the background or in a tmux window)
cd into server and run node server.js (the listening port can be changed to any other port in server.js)
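
The two launch steps above can also be scripted. The following is a minimal sketch, not part of the repository: it assumes the repository was cloned into a directory you pass as repo_root, and that python and node are on your PATH. The helper names launch_plan and start_demo are hypothetical.

```python
import subprocess
from pathlib import Path


def launch_plan(repo_root):
    """Return (working_directory, argv) pairs for the two demo processes."""
    root = Path(repo_root)
    return [
        # Inference process; the README notes that it blocks.
        (root / "model_export", ["python", "pyonnxrt.py"]),
        # Backend server that the frontend webpage talks to.
        (root / "server", ["node", "server.js"]),
    ]


def start_demo(repo_root):
    """Start inference in the background, then run the server in the foreground."""
    (infer_dir, infer_cmd), (server_dir, server_cmd) = launch_plan(repo_root)
    inference = subprocess.Popen(infer_cmd, cwd=infer_dir)
    try:
        subprocess.run(server_cmd, cwd=server_dir, check=True)
    finally:
        # Stop the background inference process once the server exits.
        inference.terminate()
        inference.wait()
```

Calling start_demo("sup-mlt-demo") from the directory that contains the clone would then replace the two manual steps; running the commands by hand in tmux, as described above, works just as well.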

Direct inference without using the frontend webpage

Clone the repository from https://github.com/Porridge144/sup-mlt-demo.git
Put the mp3/wav files you want to process into model_export/feat_extract/preprocdir/rawmp3
cd into model_export and run python pyonnxrt.py (the script is blocking, so you may need to run it in the background or in a tmux window)
cd into model_export/feat_extract and run bash run.sh
The output is saved in the server and also printed in the terminal in which pyonnxrt.py is running
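
If the audio files live elsewhere, the copy into rawmp3 can be sketched as below. This is an illustrative helper, not code from the repository: the name stage_audio is hypothetical, and the default path assumes the script is run from the root of the clone.

```python
import shutil
from pathlib import Path

# Input directory named in the steps above, relative to the repository root.
RAW_DIR = Path("model_export/feat_extract/preprocdir/rawmp3")


def stage_audio(src_dir, raw_dir=RAW_DIR):
    """Copy every .mp3/.wav file from src_dir into raw_dir and return the copied names."""
    raw_dir = Path(raw_dir)
    raw_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    for path in sorted(Path(src_dir).iterdir()):
        # Match extensions case-insensitively; skip directories and other file types.
        if path.is_file() and path.suffix.lower() in {".mp3", ".wav"}:
            shutil.copy2(path, raw_dir / path.name)
            staged.append(path.name)
    return staged
```

After staging the files, continue with python pyonnxrt.py and bash run.sh as listed in the steps.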
