GitHub - jyp0716/speaker-recognition-papers: Share some recent speaker recognition papers and their implementations.

Introduction

These are the slightly modified tensorflow/python implementation of recent speaker recognition papers. Please tell me if it is copyright infringement, I'll delete these paper as soon as I can. Our license only apply to our code these paper is not included. Thx.

The file structure is as follows：

|———pyasv
|
|—————model (folder, contain the model)
|
|—————loss (folder, contain the customized loss function)
|
|—————papers (folder, contain the origin paper of most of method)
|
|—————backend(TODO: folder, contain the method of backend)
|
|———data_manage.py (contain some method to manage data)
|
|———speech_processing.py (contain some method to extractfeature and process audio)
|
|———config.py (settings. e.g. save path, learning rate)

More info: Doc

If you want run these code on your computer, you only need to write code like this:

from pyasv import Config
from pyasv.speech_processing import ext_mfcc_feature
from pyasv.data_manage import DataManage
from pyasv.model.ctdnn import run

config = pyasv.Config(name='my_ctdnn_model',
                    n_speaker=1e3,
                    batch_size=64,
                    n_gpu=2,
                    max_step=100,
                    is_big_dataset=False,
                    url_of_bigdataset_temp_file=None,
                    learning_rate=1e-3,
                    slide_windows=[4, 4]
                    save_path='/home/my_path')
config.save('./my_config_path')

frames, labels = ext_mfcc_feature('data_set_path', config)
train = DataManage(frames, labels, config)

frames, labels = ext_mfcc_feature('data_set_path', config)
validation = DataManage(frames, labels, config)

run(config, train, validation)

TODO

Implement papers of ICASSP 2018 & Interspeech 2018.
Compare each model on a same dataset.

Implemented papers:

L. Li, Z. Tang, D. Wang, T. Zheng, "Deep Speaker Feature Learning for Text-Independent Speaker Verification."
L. Li, Z. Tang, D. Wang, T. Zheng, "Full-info Training for Deep Speaker Feature Learning," ICASSP 2018.
C. Li, X. Ma, B. Jiang, X. Li, X. Zhang, X. Liu, Y. Cao, A. Kannan, Z. Zhu, "Deep Speaker: an End-to-End Neural Speaker Embedding System."
Sergey Novoselov, Oleg Kudashev, Vadim Shchemelinin, Ivan Kremnev, Galina Lavrentyeva, "DEEP CNN BASED FEATURE EXTRACTOR FOR TEXT-PROMPTED SPEAKERRECOGNITION."

Name		Name	Last commit message	Last commit date
Latest commit History 86 Commits
docs		docs
papers		papers
pyasv		pyasv
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE.txt		LICENSE.txt
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs

docs

papers

papers

pyasv

pyasv

.gitattributes

.gitattributes

.gitignore

.gitignore

LICENSE.txt

LICENSE.txt

README.md

README.md

Repository files navigation

Introduction

TODO

Implemented papers:

About

Releases

Packages

Languages

License

jyp0716/speaker-recognition-papers

Folders and files

Latest commit

History

Repository files navigation

Introduction

TODO

Implemented papers:

About

Resources

License

Stars

Watchers

Forks

Languages